
The  Theory  of  the 

Construction  of  Tables  of  Mortality 


AND   OF 


Similar  Statistical  Tables  in  use  by  the  Actuary. 


A    COURSE    OF   LECTURES 

BY 

GEORGE    FRANCIS    HARDY, 

FELLOW    OF   THE   INSTITUTE   OF    ACTUARIES. 


DELIVERED   AT  THE 


Institute  of  Actuaries,  Staple  Inn  Hall, 

During the Session, 1904-5.


Published for the Institute of Actuaries by
CHARLES AND EDWIN LAYTON,

56, FARRINGDON STREET, LONDON, E.C.


1909.


PREFATORY NOTE.


To each set of Lectures delivered before the Institute of
Actuaries,  when  published  in  book  form,  there  has  generally 
been  prefixed  a  short  preface,  or  introduction,  written  by  the 
President  of  the  Institute  then  in  office.  This  course, 
admirable  in  itself,  cannot  well  be  followed  on  the  present 
occasion, having  regard  to  the  fact  that  Mr.  Hardy  has,  in  the 
interval  between  the  delivery  of  the  Lectures  and  their 
publication,  himself  been  elected  to  the  Presidential  chair. 
It  has  therefore  devolved  upon  us,  as  Honorary  Secretaries  of 
the  Institute,  to  insert  this  foreword  in  explanation  of  a 
seeming  omission,  and  to  express  therein  the  confidence  of  the 
Council  that  the  Lectures  will  be  found  to  be  of  the  greatest 
interest  and  value  to  the  profession,  which  already  owes  so 
deep  a  debt  of  gratitude  to  their  author. 

J.  E.  F. 

W. P. P.

PREFACE.


The object of the following Lectures was to deal with the
theoretical  considerations  that  should  govern  the  selection 
and  treatment  of  such  statistics  as  form  the  basis  of  the 
various  tables  of  mortality,  sickness,  secession,  marriage, 
superannuation,  etc.,  which  are  of  use  to  the  Actuary.  It 
should  be  noted  that  in  nearly  all  cases  where  mortality 
tables  are  specially  referred  to  what  is  said  may  be  extended 
to  other  types  of  statistics,  though,  to  avoid  repetition,  that  is 
not  always  pointed  out. 

Some apology is required for the long delay in the publication
of the Lectures. It was intended, subsequently to their
delivery, to expand them into something like a complete
treatment of the subject (from the theoretical point of view),
and to add a sufficient series of examples to illustrate the
various points of theory. Unfortunately I have not found
time to carry out this intention, but as regards that part of
the subject dealing with the use of the Pearsonian Types of
Frequency Curves in Statistics this has been rendered
unnecessary by the appearance of Mr. Elderton's admirable
book upon "Frequency Curves and Correlation", published
by the Institute of Actuaries in 1906.

A  few  additions  have,  however,  been  made  to  the 
Lectures  as  originally  delivered,  and  where  these  appeared 
to  interfere  with  the  continuity  of  the  text  they  have  been 
relegated  to  notes  placed  at  the  end  of  the  Lectures. 

I have very specially to thank Mr. G. J. Lidstone, F.I.A.,
for several valuable suggestions, in particular for the
contribution of Notes, and for assistance in preparing the
lectures for the Printers; and also Dr. James Buchanan,
M.A., F.I.A., F.F.A., for having kindly revised the proofs
and checked the algebra and numerical work.

G. F. H.


The  Theory  of  the 

Construction  of  Tables  of  Mortality 


AND    OF 


Similar Statistical Tables in use by the Actuary.


BY 

G.   F.    HARDY,  F.I.A. 


FIRST LECTURE.


When the Council asked me to deliver a series of lectures
upon some subject connected with Part III of the Institute
Examination I selected the construction of mortality and
similar statistical tables, mainly because it seemed to me to lie
at  the  basis  of  our  work.  Actuarial  science,  in  the  modern 
sense  of  the  term,  had  its  origin  in  the  collection  of  statistics 
(however  rough  and  inaccurate  these  may  have  been),  and  their 
use  for  the  purpose  of  calculating  life  contingencies;  and 
although the Actuary has now to take account of a wider range
of  subjects  than  formerly,  the  collection  and  analysis  of  past 
experience  and  the  employment  of  the  results  of  such  analysis 
to  forecast  the  future  is  still  his  most  important  function. 

The  title  of  the  lectures  is  somewhat  wider  and  more 
ambitious  than  the  contents  may  be  found  to  warrant.  To 
justify  it  fully  would  involve  dealing  with  many  questions  of 
detail  relating  to  the  collection  and  tabulation  of  data,  such, 
for example, as the various methods for computing the
numbers  exposed  to  risk  in  a  mortality  experience,  &c., 
which  have  been  many  times  discussed  in  the  volumes 
of  the  Journal  of  the  Institute  of  Actuaries  and  many  of 
which  are  exhaustively  dealt  with  by  Mr.  Ackland  in  the 
recently published "Account of Principles and Methods." It
is evident that to deal with the subject in such detail would
outrun the limits of the six lectures which I have undertaken
to deliver. I propose, therefore, to confine myself mainly to
a consideration of the general principles involved in the
collection of statistical data, and in the construction from
such data of tables, of which the Mortality Table is the best
known and the most important, embodying the results in the
form  required  by  the  Actuary,  and,  at  the  same  time,  to  give 
such  examples  of  the  application  of  these  principles  as  may 
be  necessary  to  illustrate  the  subject. 

In  this  opening  lecture  in  particular,  I  shall  ask  your 
indulgence  if  occasionally  my  remarks  appear  to  be  of  an 
elementary  character,  as  I  think  it  desirable  that  we  should 
be  perfectly  clear  as  to  first  principles  before  going  on  to 
more  detailed  consideration  of  the  subject. 

Statistical  tables,  in  one  form  or  another,  are  familiar  to 
all  of  us.  At  the  basis  of  all  such  tables,  and,  indeed,  of  the 
whole  science  of  statistics,  lies  one  of  the  most  fundamental 
facts  in  nature,  namely,  that  all  phenomena  of  which  we 
have  any  knowledge  fall  into  certain  classes,  groups  or  series, 
and cluster round certain types. But for this fact we should
be unable to classify our knowledge, indeed, should never
have acquired any to classify. Speaking broadly, then, every
object and every event that comes within our observation is
one  of  a  group  or  class  of  similar  but  not  identical  objects  or 
events,  which,  as  a  class,  is  marked  off  by  certain  special 
features  from  every  other  class,  although  the  dividing  line 
may  not  always  be  sharply  drawn.  These  groups  or  classes 
are  not  arbitrary,  but  are  inherent  in  the  nature  of  things, 
although  it  is  true  that  the  particular  groups  which  we  employ 
in  classifying  our  knowledge  are  chosen  with  a  view  to  our 
own  convenience  and  to  the  limitations  of  our  minds. 

From a consideration of a class of objects as a whole, we
get a conception of an average, or type,* to which each
individual in the class more or less conforms, but from
which, notwithstanding, every individual also diverges. Such
divergencies or variations of individuals from the average
type may be discontinuous, themselves running into types, or
they may be continuous. Among the individuals forming
together the type mankind, are divergencies such as those
due to sex, race, nationality, birthplace, occupation, civil
condition, &c., discontinuous variations producing
sub-groups, the boundaries of which overlap and interlace, each
of these smaller groups again being capable of endless
subdivision. These divergencies can be dealt with statistically
only by counting the members of the various sub-groups.

* The type of the class should preferably be considered as represented by the
"mode" or case of most frequent occurrence rather than by the "average" or
"mean", but this point is not here of importance.

On the other hand, there are divergencies, which we may
term  continuous,  such  as  those  due  to  differences  of  age, 
height,  weight,  income,  &c.,  &c.,  differing  from  the  former 
class  in  that  they  do  not  involve  the  separation  of  the  main 
group  into  sub-groups,  but  relate  to  qualities,  possessed  by 
each  member  of  the  group  in  varying  degree,  capable  of 
measurement  and  numerical  statement,  and  involving  the  idea 
in  each  instance  of  an  average.  Thus  we  can  speak  of  the 
average  age,  height,  or  income  of  a  group  of  persons,  not  of 
their average occupation or nationality, although we may
speak  of  the  average  constitution  of  the  group  in  respect  of 
these  latter  qualities. 

A statistical table deals with some natural group of
objects  or  events  and  is  a  numerical  statement  of  the  manner 
in  which  the  members  of  the  particular  group  differ  inter  se  in 
respect  of  some  special  character  or  characters.  If  dealing 
with  discontinuous  variations,  as  for  example  a  table  showing 
the  occupations  of  a  group  of  persons,  it  will  exhibit,  implicitly 
or  explicitly,  the  ratio  of  the  magnitude  of  each  sub-group  to 
the  whole,  at  a  given  moment  or  moments  or  on  an  average  of 
a  given  period ;  or  it  may  take  the  form  of  a  statement  of  the 
extent  to  which  variations  in  one  respect  are  affected  by 
variations  in  another,  as,  for  example,  a  table  showing  the 
proportion  of  the  sexes  in  different  nationalities.  If  dealing 
with  continuous  variations,  it  will  either  represent  a  series  of 
measurements  of  some  quality  common  to  members  of  the  group, 
showing  its  average  value  for  the  group,  and  the  manner  in 
which  individual  values  are  grouped  round  such  average,  or  it 
may represent, numerically, the manner in which deviations
from the average in respect of some one quality A are
correlated with the deviations in respect of some other quality B.

It  is  mainly  with  the  class  of  statistical  table  dealing  with 
continuous  variations  that  the  Actuary  has  to  deal  ;  variations 
in the ages of lives under observation, their ages, or the
periods  elapsed  since  entry,  at  death,  withdrawal,  marriage, 
superannuation,  &c.  In  such  tables  the  grouping  of  individual 
measures  round  the  average  will,  in  general,  but  not  always, 
be  found  to  follow,  approximately,  certain  well-defined  laws. 
Taking   first  the  tables   dealing  with  a  single  variable,  the 


following  may  be  considered  as  an  example.  It  is  a 
statement  of  the  heights  of  2,192  school  children,  and  is 
abridged  from  that  given  in  a  paper  by  Prof.  Karl  Pearson. 

Table I.
Showing heights of 2,192 School Children, aged 12 years.

                                                 Computed - Observed
 Heights in    No. of Children   Computed Nos.
 Centimetres      Observed         by Curve          +         -
     (1)            (2)              (3)            (4)       (5)

   139-140            1              ...            ...         1
   135-138            6                3            ...         3
   131-134           31               25            ...         6
   127-130          107              119             12       ...
   123-126          321              338             17       ...
   119-122          585              577            ...         8
   115-118          618              596            ...        22
   111-114          359              365              6       ...
   107-110          126              135              9       ...
   103-106           35               30            ...         5
    99-102            3                4              1       ...

   Total          2,192            2,192             45        45

Note. In the formula (col. 3) x represents the deviation in centimetres
from the average; c = 7.76 and k has such a value, $\frac{2192}{c\sqrt{\pi}}$,
as to make the area of the graduated curve equal to the ungraduated; that is,
to make the totals of columns (2) and (3) equal.

If we consider the progression of the numbers in
column (2), we shall see that they form a roughly symmetrical
series, being largest in the neighbourhood of the average
height and diminishing gradually on either side. It will be
seen that the average height is about 118 centimetres, the number
exceeding this height being approximately equal to the
number falling short of it. In order to bring out the
approximate law of the series, I have inserted in column (3)
the computed numbers on the assumption that the frequency
of a deviation of x centimetres from the average is
represented by the function $k e^{-x^{2}/c^{2}}$, where c has the value 7.76
and k the value $\frac{2192}{c\sqrt{\pi}}$. The expression $k e^{-x^{2}/c^{2}}$ represents
what is usually termed the curve of "facility of error",
or the "normal" curve of frequency. It will be seen that
while the figures in column (2) are, as we should expect
them to be with such limited data, somewhat irregular, they
conform on the whole fairly closely to the normal curve.
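
As a rough check on column (3) of Table I, the normal curve may be integrated over each height group. The sketch below is illustrative only. It assumes that heights were recorded to the nearest centimetre (so that the group 99-102 runs from 98.5 to 102.5), that the highest group is four centimetres wide like the others, and that the mean may be taken from the group mid-points; none of these conventions is stated in the lecture.

```python
import math

# Table I, lowest group first: (lower, upper) height in cm, observed number.
# Assumption: each group extends half a centimetre beyond its printed limits,
# and the top group is taken as 139-142.
groups = [(99, 102), (103, 106), (107, 110), (111, 114), (115, 118),
          (119, 122), (123, 126), (127, 130), (131, 134), (135, 138),
          (139, 142)]
observed = [3, 35, 126, 359, 618, 585, 321, 107, 31, 6, 1]
printed = [4, 30, 135, 365, 596, 577, 338, 119, 25, 3, 0]  # column (3)

N = sum(observed)
c = 7.76

# Mean height estimated from the group mid-points (an assumption).
mean = sum(n * (lo + hi) / 2 for (lo, hi), n in zip(groups, observed)) / N


def area(lo, hi):
    """Integral of k*exp(-x^2/c^2) over one group, with k = N/(c*sqrt(pi))."""
    a = (lo - 0.5 - mean) / c
    b = (hi + 0.5 - mean) / c
    return N / 2 * (math.erf(b) - math.erf(a))


computed = [area(lo, hi) for lo, hi in groups]

print(f"mean height = {mean:.2f} cm")
for (lo, hi), obs, comp, prt in zip(groups, observed, computed, printed):
    print(f"{lo:3d}-{hi:3d}  observed {obs:4d}  computed {comp:6.1f}  printed {prt:4d}")
```

On these assumptions the computed numbers should agree with the printed column to within a few units, the residual differences presumably reflecting the rounding and the exact mean used in the original graduation.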


The "normal curve" was first used to represent the
distribution as to magnitude of errors of observation in physical
measurements. It must not be regarded as representing a law
of Nature, but rather an extremely convenient and often very
close approximation to observation; experience proving that
in many cases errors of observation and the deviations of
individuals from the mean of a class do follow very closely
the law referred to. The formula is therefore empirical and
not  to  be  established  by  a  priori  reasoning ;  at  the  same  time 
we  may,  perhaps,  see  a  logical  basis  in  the  following 
consideration.  We  may  suppose  that,  in  any  individual 
measurement,  the  deviation  from  the  mean  of  the  class  (as 
the  difference  in  the  height  of  any  individual  among  the 
2,192  in  Table  I  from  the  average  height  of  the  whole 
group)  is  the  result  of  an  infinity  of  minute  causes  as  to 
whose  nature  we  are  in  ignorance,  any  one  of  which  may 
produce  a  minute  positive  or  negative  deviation  from  the 
average.  These  minute  superimposed  deviations  being 
indefinitely  small  and  indefinitely  numerous,  we  may  without 
loss  of  generality  assume  them  of  equal  magnitude.  It  is 
then clear that the magnitude and sign of the total resulting
deviation in any given case will depend upon the extent to
which  the  number  of  these  minute  positive  deviations  exceed 
the  negative,  or  vice  versa. 

If the number of possible causes of deviation is 2n,
and if the extent of each indefinitely small deviation is k
(n being indefinitely large, but $k\sqrt{n}$ finite), then the
probability or "frequency" of a total deviation lying between
x and x + k will depend on our having $\left(n + \frac{x}{2k}\right)$ positive
values of k and $\left(n - \frac{x}{2k}\right)$ negative values. The probability
of this occurring will be represented by the appropriate term
in the expansion of the binomial $\left(\tfrac{1}{2} + \tfrac{1}{2}\right)^{2n}$, or

$$\frac{1}{2^{2n}} \cdot \frac{(2n)!}{\left(n + \frac{x}{2k}\right)!\,\left(n - \frac{x}{2k}\right)!}$$

It may easily be shown that this expression, n being
indefinitely great, takes the form

$$\frac{1}{\sqrt{n\pi}}\, e^{-\frac{1}{n}\left(\frac{x}{2k}\right)^{2}}, \quad \text{i.e.,} \quad (\text{Constant}) \times e^{-x^{2}/c^{2}},$$

where $c^{2} = 4nk^{2}$; that is, the form of the curve of the
"facility of error." I do not propose to
discuss at any length the properties of this particular curve,‡
but you will notice that the curve being symmetrical with
respect to positive and negative values of x, it assumes
that positive and negative deviations of a given magnitude
are equally frequent, the average magnitude of such
deviations being small or large as c is small or large. The maximum
ordinate corresponds to the value of x = 0, which is the
average value of x; it therefore passes through the centre
of gravity of the area enclosed by the curve and the axis
of x, and also divides that area into two equal parts. It
assumes that indefinitely large deviations are possible, hence
it cannot be rigidly exact, because when dealing with physical
measurements of any kind, indefinitely large errors are not
possible. This is not a practical objection to the use of the
formula, however, as the probability thereunder of deviations
of many times the average value is extremely small.
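
The passage from the binomial term to the exponential form can be checked numerically. The following is a minimal sketch, not part of the lecture: it writes m for x/(2k), takes the moderate value n = 50 purely for illustration, and compares the exact binomial term with the limiting expression exp(-m²/n)/√(πn).

```python
import math


def binomial_term(n, m):
    """Chance of n+m positive and n-m negative equal deviations out of 2n."""
    return math.comb(2 * n, n + m) / 2 ** (2 * n)


def normal_term(n, m):
    """Limiting form exp(-m^2/n) / sqrt(pi*n), where m stands for x/(2k)."""
    return math.exp(-m * m / n) / math.sqrt(math.pi * n)


n = 50  # illustrative value only; the argument supposes n indefinitely large
rows = [(m, binomial_term(n, m), normal_term(n, m)) for m in range(0, 21, 5)]

for m, exact, approx in rows:
    print(f"m = {m:2d}   binomial = {exact:.6f}   normal form = {approx:.6f}")
```

Even for so small a value of n the two columns are close near the centre, and the agreement improves as n is increased.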

The following table, showing the number of entrants in
various age groups in the $O^{M}$ Experience, exhibits a quite
different distribution of the deviations from the average:

Table II.
Number of entrants in quinary age groups, $O^{M}$ data.

                                                    Computed - Actual
 Central Age    Actual Entrants   Computed No.
 of Group, x      in Group*       by formula†         +         -
     (1)             (2)              (3)            (4)       (5)

      20             431              436              5       ...
      25           1,273            1,305             32       ...
      30           1,526            1,473            ...        53
      35           1,269            1,265            ...         4
      40             914              930             16       ...
      45             591              604             13       ...
      50             354              349            ...         5
      55             182              178            ...         4
      60              83               79            ...         4
      65              26               29              3       ...
      70               7                8              1       ...
      75               1                1            ...       ...

   Totals          6,657            6,657             70        70

* Omitting hundreds.

† Formula representing number of entrants at given age x:
$K\,(x - 18.59)^{\cdots}\,(88.48 - x)^{\cdots}$, where log K = -9.2360.

‡ The student may consult Woolhouse's paper on "The Philosophy of
Statistics" (J.I.A., vol. xvii, p. 37), or an exhaustive analysis of the properties
of the curve by Mr. Sheppard (Phil. Trans., vol. 192, p. 101); see also Bowley's
"Elements of Statistics", Part II, Sec. II.


Here the numbers also exhibit a well-marked law
governing the deviations from the mean, but this law is no
longer the same as that shown by the "normal" curve of
frequency. The maximum ordinate does not coincide
either with the average age or with the central age
of the series; while the number of cases exceeding the
average age no longer equals the number falling short of it.
In other words, the curve is non-symmetrical or skew. It
follows very approximately, however, a certain law, as will
be seen by comparing the numbers in column (2) with
those in column (3), which represent the computed numbers
according to the formula stated.

Having  regard  to  the  fact  that  the  numbers  in  column  (2) 
represent 100's and not units, the differences between the
actual  and  computed  numbers  are  somewhat  outside  the 
probable  errors  of  observation.  There  are,  that  is  to  say, 
"systematic"  differences  between  the  two  curves.  These 
systematic  differences  are  generally  to  be  expected  in  dealing 
with  age  statistics.  It  will  be  seen  that  they  are  not 
incompatible  with  a  close  agreement  in  the  general  features 
of  the  two  curves,  but  they  serve  as  a  warning  that,  in 
statistics  of  this  nature,  formulae  representing  the 
distribution  of  deviations  from  the  mean  must  be  regarded 
as  approximations   only. 
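
The size of these differences relative to chance fluctuation can be gauged roughly as follows. This is only a sketch under an assumption that is not made in the lecture: the fluctuation of a count is taken to be of the order of the square root of the computed number (a Poisson-type approximation), after restoring the omitted hundreds.

```python
import math

ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
actual = [431, 1273, 1526, 1269, 914, 591, 354, 182, 83, 26, 7, 1]    # hundreds
computed = [436, 1305, 1473, 1265, 930, 604, 349, 178, 79, 29, 8, 1]  # hundreds

# Assumed yardstick: sqrt(computed count) as the scale of accidental error.
ratios = []
for a, c in zip(actual, computed):
    difference = (a - c) * 100           # restore the omitted hundreds
    ratios.append(difference / math.sqrt(c * 100))

for age, r in zip(ages, ratios):
    print(f"central age {age}: difference / sqrt(computed) = {r:6.1f}")
```

On this yardstick several groups differ from the formula by many times the assumed scale of accidental error, which is consistent with the differences being systematic rather than fortuitous.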

If  we  consider  the  curves  exhibited  in  Tables  I  and  II  we 
see that the general character of such curves is determined
by  a  few  salient  features : 

1. The position of the maximum ordinate; that is, the
   value of the variable having maximum frequency.
   This value is termed the mode.

2. The average or mean value of the variable, being the
   arithmetical mean of all individual values. In a
   symmetrical curve this coincides with the "mode."

3. The average deviation from the mean, corresponding
   to the closeness with which the individual measures
   are grouped round their mean value. There is a
   certain convenience, for analytical reasons, in
   adopting as our standard in this respect either the
   mean of the squares of the individual deviations, or
   the square root of this quantity. The latter is
   termed the standard deviation. We may represent
   the average of the squares of the deviations, or the
   "mean square" deviation, by the symbol $\mu_2$, when
   the standard deviation becomes $\sqrt{\mu_2}$.

4. The equality or otherwise of the positive and negative
   deviations from the mean; that is, the symmetry or
   skewness of the curve. The sum of the first powers
   of the deviations is, of course, always zero. If the
   curve is symmetrical, the sum of any odd power of the
   deviations must be zero, but not otherwise. As we
   have employed the square root of the average
   square of the deviations as a measure of the
   diffuseness or spread of the curve, termed the
   "standard deviation", so we may take the ratio of
   the cube root of the average cube deviation to the
   "standard deviation" as the standard of
   "skewness." If we represent the average cube
   deviation by the symbol $\mu_3$, the skewness of the curve
   may then be measured by $\frac{\sqrt[3]{\mu_3}}{\sqrt{\mu_2}}$.

The  skewness  is  sometimes  taken  as  the  difference  between 
the  "  mean "  and  the  "  mode ",  divided  by  the  standard 
deviation. 

The  sums  of  the  successive  powers  of  the  deviations 
of  the  variable  from  the  mean,  the  area  of  curve  being 
taken  as  unity,  are  termed   the  moments  of  the  curve. 
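
These measures can be illustrated on the entrants of Table II. The sketch below is an illustration only and assumes that every entrant in a group may be treated as entering at the central age of the group; the variable names are hypothetical.

```python
import math

ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75]       # central ages
entrants = [431, 1273, 1526, 1269, 914, 591, 354, 182, 83, 26, 7, 1]

N = sum(entrants)

# 1. Mode: the central age of the group with the greatest frequency.
mode = ages[entrants.index(max(entrants))]

# 2. Mean: arithmetical mean of all individual values.
mean = sum(f * x for x, f in zip(ages, entrants)) / N


def central_moment(r):
    """Average of the r-th powers of the deviations from the mean."""
    return sum(f * (x - mean) ** r for x, f in zip(ages, entrants)) / N


# 3. Mean square deviation and standard deviation.
mu2 = central_moment(2)
sd = math.sqrt(mu2)

# 4. Skewness: cube root of the mean cube deviation over the standard deviation.
mu3 = central_moment(3)
skew_cube_root = math.copysign(abs(mu3) ** (1 / 3), mu3) / sd

# Alternative measure: (mean - mode) / standard deviation.
skew_mean_mode = (mean - mode) / sd

print(f"mode {mode}, mean {mean:.2f}, standard deviation {sd:.2f}")
print(f"skewness (cube-root measure) {skew_cube_root:.3f}")
print(f"skewness (mean minus mode)   {skew_mean_mode:.3f}")
```

Both measures of skewness come out positive, the mean lying some years above the mode, as the run of the figures in Table II would suggest.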

These  observed  laws  of  the  variation  of  measurements 
from  their  mean  are  very  general,  and  are  usually,  though  not 
invariably,  associated  with  what  is  termed  "  homogeneous " 
data. The distinction between "homogeneous" and
"heterogeneous" data is of considerable importance, although not
very easy to define. We may perhaps define a homogeneous
group as one in which the continuous variations are from a
single type only, and are unaffected by any discontinuous
variations in the group if these exist. These conditions will
hardly ever prevail, but a group may be considered for practical
purposes as homogeneous if the variations in the particular
quality dealt with are not materially affected by any discontinuous
variations existing in the group. If, however, the group can
be split up into two or three distinct series differing markedly
in certain qualities, and these differences are found, or may
reasonably be supposed, to affect the character under
examination, then the series is "heterogeneous."

Take, for example, the class representing assured lives of
a given age, but of varying duration of assurance, and assume
we are investigating the rate of mortality of the class. If it is
found on examination that the duration of assurance materially
affects the rate of mortality, then the data treated as a whole
is heterogeneous. If it is found, however, that the duration
of assurance after reaching a certain point has no such
influence, or an influence that is insignificant, then the data
from this point and in this respect may be treated as
homogeneous.  The  same  considerations  apply  to  distinctions 
in  class  of  assurance,  amount  of  policy,  occupation,  &c. 

The  laws  which  appear  to  govern  deviation  from  the 
average  in  homogeneous  data  are,  in  general,  so  uniform  in 
action  that  a  departure  therefrom  will  frequently  indicate 
that  data  which  might  be  supposed  to  be  homogeneous  are  not 
so.  An  interesting  illustration  of  this  may  be  seen  in  the 
case  of  the  Male  Annuitants  in  the  New  Offices'  Annuity 
Experience.  Consider  the  following  table  showing  the  number 
of  entrants  for  various  groups  of  ages  : — 


Table III.

Male Annuitants, $O^{am}$ Data.
Number of entrants at various ages, 1863-1893.

                               Computed Numbers     Observed - Computed
 Ages at                        (normal curve
 Entry, x       Entrants      centred on age 65)       +          -
    (1)            (2)               (3)              (4)        (5)

  33-37             73                 5               68        ...
  38-42            119                21               98        ...
  43-47            207                89              118        ...
  48-52            421               266              155        ...

  53-57            599               587               12        ...
  58-62            957               954                3        ...
  63-67          1,147             1,142                5        ...
  68-72            982             1,007              ...         25
  73-77            660               655                5        ...

  78-82            252               313              ...         61
  83-87             72               109              ...         37
  88-92             15                29              ...         14
  93-98              1                 6              ...          5

These particular age groups are selected as there appears
to be a slight excess in the number of entrants at decennial
and quinquennial ages, and by placing these in the middle
of the groups we get rid of the disturbance, which would
otherwise affect the numbers.

An  examination  of  the  numbers  in  column  (2),  between 
ages  53  and  78,  shows  that  they  form  a  nearly  symmetrical 
curve,  as  is  seen  by  a  comparison  with  a  "normal"  curve  of 
frequency  given  in  column  (3).*  The  numbers  above  age  78, 
however,  are  in  defect,  and  those  below  53  are  considerably  in 
excess  of  the  figures  suggested  by  the  normal  curve.  As 
regards  the  falling-off  of  the  numbers  at  the  older  ages,  it 
may  be  conjectured  that  it  is  in  part  due  to  the  fact  that  many 
published  tables  of  the  cost  of  annuities  cease  at  age  75  or 
80.  The  observed  excess  in  the  number  of  entrants  at  ages 
below  50  evidently  represents  the  entrance  at  these  ages  of  a 
class  of  lives  differing  from  those  forming  the  bulk  of  the 
data.  It  may  perhaps  be  conjectured  that  a  number  of  these 
cases  are  counter  lives  in  contingent  reversions,  or  similar 
securities,  upon  whose  lives  annuities  have  been  purchased  to 
secure  the  payment  of  annual  premiums.  Be  that  as  it  may, 
we find that while the deficiency of entrants at the older
ages does not appear to affect the mortality rates, the entrants
at the younger ages on the contrary show abnormally heavy
mortality, the ungraduated values of the expectation of life
for entrants under age 55 being relatively low. Hence we
may conclude that the male annuitant experience is
heterogeneous, and in using the results as a basis of calculation
for the future, the abnormal part of the experience representing
the entrants at the younger ages was properly rejected.
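
The pattern of excess and defect described above can be summarised from the figures of Table III. The following is a small illustrative sketch; the three age bands are simply those named in the text, and the variable names are hypothetical.

```python
groups = ["33-37", "38-42", "43-47", "48-52", "53-57", "58-62", "63-67",
          "68-72", "73-77", "78-82", "83-87", "88-92", "93-98"]
observed = [73, 119, 207, 421, 599, 957, 1147, 982, 660, 252, 72, 15, 1]
computed = [5, 21, 89, 266, 587, 954, 1142, 1007, 655, 313, 109, 29, 6]

# Observed minus computed for each group of ages at entry.
excess = [o - c for o, c in zip(observed, computed)]

below_53 = sum(excess[:4])     # ages 33-52
from_53_to_77 = sum(excess[4:9])
from_78 = sum(excess[9:])      # ages 78-98

print("excess below age 53:      ", below_53)
print("excess from age 53 to 77: ", from_53_to_77)
print("excess from age 78:       ", from_78)
```

The differences balance between ages 53 and 77, while the whole of the net excess lies below 53 and the whole of the net defect above 77.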

* The constants of this curve were only roughly determined, but the
agreement with the observed numbers between ages 53 and 78 is sufficiently
close to illustrate the point under discussion.

In addition to tables of the kind we have been considering,
a statistical table may be a numerical statement of the
manner in which variation in one particular from the average
of the group is accompanied by variation in some other
particular. We may, for instance, have a table representing
a number of individuals, arranged according to height, the
numbers at each height being further arranged according to
weight. We should then have a table of double entry, each
row or column of which would represent a statistical table of
the form already considered. By means of this table we should
be able to "correlate", as it is termed, variations in respect to
weight with variations in respect to height. Such a table
would represent a mass of figures, the bearing of which could
not easily be grasped without some further analysis. If,
however, we add to the table a column showing the average
weight for persons of a given height, we then have a ready
means of seeing how this average weight is affected by a
change in height. Having inserted the average, we have not
exhausted the information which the original figures give us.
We need also to know to what extent on the average the
weight varies when the height remains constant; that is, we
need to insert against each average weight what we have
termed the "standard deviation."

A familiar example of such a table is one showing the
ages of husbands and wives at marriage. Such a table would
take  the  following  form — 

Table IV.
Showing Ages of Husbands and Wives at date of Marriage.

                                 Wives' Ages
 Husbands'                                                       Mean Ages
   Ages       under 20   20-30   30-40   40-50   50-60   60-70   of Wives

 under 20        13         5     ...     ...     ...     ...      17.8
 20-30          215       500      16       1     ...     ...      22.3
 30-40           14       107      39       4     ...     ...      27.0
 40-50            1        14      23      12       2     ...      35.0
 50-60          ...         2       6       9       4     ...      42.1
 60-70          ...       ...       1       3       4       2      52.0
 70-80          ...       ...     ...       1       1       1      55.0

 Mean Ages
 of Husbands    25.1      27.2    37.6    49.0    58.6    68.3       -

If there were no correlation between the ages of the
husbands and wives at marriage, the figures showing the
average ages for the various columns would (except for
accidental fluctuations) be identical, and the same would hold
for the average ages of the successive rows.

If a line were drawn through the table cutting those
points in the rows corresponding to the average ages, and
another line similarly cutting those points in the columns
representing average ages, it would be found that these
points could roughly be represented by straight lines, which
in the present example would be nearly coincident, since the
spread of the figures, as measured by their standard deviation,
is very similar in both rows and columns.
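
The marginal means of Table IV, together with the standard deviation of each row, can be recomputed from the body of the table. The sketch below is illustrative and rests on an assumption not stated in the lecture: each ten-year group is represented by its central age (25, 35, and so on) and the group "under 20" by age 15.

```python
import math

wife_age = [15, 25, 35, 45, 55, 65]           # assumed central ages of columns
husband_age = [15, 25, 35, 45, 55, 65, 75]    # assumed central ages of rows

# Rows are husbands' age groups, columns wives' age groups (0 = no cases).
# The (60-70, 60-70) cell is read as 2, the value consistent with the means.
counts = [
    [13, 5, 0, 0, 0, 0],
    [215, 500, 16, 1, 0, 0],
    [14, 107, 39, 4, 0, 0],
    [1, 14, 23, 12, 2, 0],
    [0, 2, 6, 9, 4, 0],
    [0, 0, 1, 3, 4, 2],
    [0, 0, 0, 1, 1, 1],
]


def mean_and_sd(values, weights):
    """Weighted mean and standard deviation of grouped values."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(var)


# Mean age (and spread) of wives for each age group of husbands.
row_stats = [mean_and_sd(wife_age, row) for row in counts]
row_means = [m for m, _ in row_stats]

# Mean age of husbands for each age group of wives.
columns = [[row[j] for row in counts] for j in range(len(wife_age))]
col_means = [mean_and_sd(husband_age, col)[0] for col in columns]

print("mean ages of wives:   ", [round(m, 1) for m in row_means])
print("s.d. of wives' ages:  ", [round(s, 1) for _, s in row_stats])
print("mean ages of husbands:", [round(m, 1) for m in col_means])
```

With these central ages the recomputed means reproduce the marginal figures of the table, and the steady rise of each set of means with the age of the other spouse exhibits the correlation discussed in the text.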

It  is  not  always  the  case,  however,  that  the  nature  of  the 
correlation  can  be  represented  by  a  straight  line.  In  the 
following  example  we  have  a  somewhat  different  class  of 
table  showing  the  proportions  for  different  age  groups  of 
wives  and  widows  in  an  Indian  pension  fund. 

Table IVa.
Showing proportion of Wives and Widows in a Pension Fund.

                Number of    Number of                Widows per-cent
   Ages           Wives        Widows       Total        of Total

 under 20            19          ...           19           0.0
 20-30            1,430           50        1,480           3.4
 30-40            3,366          355        3,721           9.5
 40-50            3,329        1,018        4,347          23.4
 50-60            1,653        1,312        2,965          44.2
 60-70              476          933        1,409          66.2
 70-80               63          330          393          84.0
 80-90                6           46           52          88.5

Here it will be seen, from the run of the figures in the last
column, that they cannot be well represented by a straight
line, being somewhat in the form of the curve of $\int e^{-x^2}\,dx$,
or of the curve [...], with values of 0 and 1 respectively at
the limits.
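
The last column of Table IVa, and the reason a straight line will not represent it, can be checked directly from the counts. This is an illustrative sketch; the function names are not from the lecture. A straight line would give equal increments from one age group to the next, whereas here the increments rise to a maximum in middle life and then fall away.

```python
# Counts from Table IVa, by age group from "under 20" to "80-90".
WIVES = [19, 1430, 3366, 3329, 1653, 476, 63, 6]
WIDOWS = [0, 50, 355, 1018, 1312, 933, 330, 46]


def widow_percentages():
    """Widows as a percentage of wives and widows together, for each age group."""
    return [round(100 * d / (w + d), 1) for w, d in zip(WIVES, WIDOWS)]


def successive_increments(values):
    """Increase from each age group to the next; constant for a straight line."""
    return [round(b - a, 1) for a, b in zip(values, values[1:])]
```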

Such  a  table  of  correlation  has  an  analogy  with  the  table 
of  the  "  Exposed  to  Risk "  and  "  Died ",  which  ordinarily 
forms  the  basis  of  our  Mortality  Tables.  This  table  is 
virtually  in  the  following  form — column  (4)  representing  the 
number  of  annual  survivors  being  usually  omitted  as  being 
implicitly  contained  in  columns  (2)  and  (3)  — 

Table of Exposed to Risk and Died.

Age     Exposed to Risk     Died     Survived
(1)           (2)            (3)        (4)

We have here the ages of the persons observed; the
numbers under observation, or "Exposed to Risk", which,
for the sake of simplicity, we will suppose to remain under
observation for the entire year of age; the number of those who
die during the year, and of those surviving. If we represent
the rate of mortality by $q_x$, then in all cases in column (3)
$q_x = 1$, and in all cases in column (4) $q_x = 0$, and we have a
table which is analogous to the table of the weights of
individuals of respective heights, only that instead of having
various values of $q_x$, we have in the nature of things only
two possible values 0 and 1, the average value for each group
representing the observed "rate of mortality." This table
differs from that correlating weights and heights, or ages of
husbands and wives at marriage, agreeing with that correlating
age  and  civil  condition,  in  the  fact  that  a  certain  quality 
or  characteristic,  in  this  case  death  during  a  given  year 
of  age,  is  not  present  in  varying  proportions,  but  is 
either  present  or  entirely  absent.  We  are  thus  introduced 
to  the  conception  of  probability,  the  proportion  of  any 
group  surviving  or  dying  representing  the  "  probability " 
of  survival  or  death  for  any  individual  of  the  group  taken 
at  random.  The  idea  of  probability  is  also  present  in 
the  supposed  table  of  weights,  although  not  so  obviously. 
That  table  would  inform  us,  for  example,  of  the  probability 
of  a  person  of  given  height  exceeding  or  falling  short  of 
a certain fixed standard weight, and we should then have
a table identical in form with the table of Exposed to
Risk  and  Died. 
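
The statement that the observed rate of mortality is the average of a quantity taking only the values 0 and 1 may be illustrated as follows. The figures are hypothetical, chosen only to show the arithmetic, and the function names are illustrative. The second function shows how a table of weights reduces to the same form once a fixed standard weight is chosen.

```python
def observed_rate(died_flags):
    """Average of the 0/1 values: 1 for each death, 0 for each survivor."""
    return sum(died_flags) / len(died_flags)


def exceeds_standard(weights, standard):
    """Turn measured weights into 0/1 values: 1 where the standard is exceeded."""
    return [1 if w > standard else 0 for w in weights]


# Hypothetical group: 1,000 lives exposed for the whole year of age, 12 deaths.
DIED_FLAGS = [1] * 12 + [0] * 988

# Hypothetical weights (in pounds) of persons of one given height.
WEIGHTS = [140, 152, 149, 161, 155, 147, 158, 150]
```

Here `observed_rate(DIED_FLAGS)` is simply the ratio of the died to the exposed, and `observed_rate(exceeds_standard(WEIGHTS, 150))` is the observed proportion exceeding the standard.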

This  conception  of  probability  is  important  to  the  Actuary, 
because  his  object  in  collecting  statistics  is  the  distinctly 
practical  one  of  measuring  the  probability  of  the  happening 
of  certain  contingencies.  It  is  necessary  to  realise  clearly 
what  is  meant  by  the  statement  that  the  probability  of  a 
particular  event  has  this  or  that  value.  Laplace  pointed 
out that when we speak of the probability of the happening
of a given event, we do so only on account of our ignorance
of the antecedents of the event, or our inability to completely
analyze them. If we entirely knew the antecedents, and if
our powers of analysis were equal to the task, we could
predict the event. In many cases we are able to do this
approximately, but where the effective causes at work
are numerous and obscure, and the result in individual
(apparently similar) cases is very variable, as in all questions
affecting   life  contingencies,  we    are  unable  to  forecast  the 
event  in  a  given  case,  and  must  fall  back  upon  the  average 
result  deduced  from  the  examination  of  a  large  number  of 
similar  cases.     In  other  words,  we  treat  the  particular  case  in 
question  as  one  of  an  indefinitely  large  class  of  similar  cases, 
a  sample  of  which  we  have  already  had  under  examination. 
From  the  results  of  such  examination  we  infer  the  composition 
of    the    class    as  a  whole,   and  hence  the  "  probability "  or 
average    event   in    an   individual    case.      If,    in   the    sample 
observed,  a  given  character  is  present  in  a  certain  proportion 
of  cases,  as  for  instance,  where  out  of  a  number  of  persons  of 
given  age  under  observation,  a  certain  proportion  have  died 
within the year of age, then we estimate* the probability of
the event happening in a particular instance, by the ratio
which the number of cases in which the event has occurred
bears to the entire number of cases observed. To determine
the  probability  of  a  given   event  is  therefore  to  assign  the 
case  to   the  natural  group   or    series   to   which   it   properly 
belongs  and  to  pass  under  examination  a  sample  of  the  group 
sufficiently  large  to  enable  us  to  determine  approximately  the 
average   character   of   the  whole  as   regards    the  particular 
quality  in  question.     We  are  here  speaking  of  simple  events ; 
the  probability  of  a  complex  event,  such  as  the  survival  of 
one  life  by  another,  is,  of  course,  not  determined  directly  by 
past  observations.     The  latter  yield  the  simple  probabilities 
of  surviving  each  year  of  age,  by  suitably  combining  which 
we  arrive  at  the  value  of  the  probability  desired. 
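
The combination of simple yearly probabilities into the probability of a complex event may be sketched as below. The yearly rates are hypothetical, the function names are illustrative, and the second function assumes the two lives to be independent; it is a simplified stand-in for a true survivorship probability, not the full calculation.

```python
from math import prod


def survival_probability(yearly_q):
    """Chance of surviving all the years in turn: the product of (1 - q)."""
    return prod(1.0 - q for q in yearly_q)


def survives_while_other_dies(q_first, q_second):
    """Chance that the first life is alive at the end of the term and the
    second is not, assuming (hypothetically) that the lives are independent."""
    return survival_probability(q_first) * (1.0 - survival_probability(q_second))


# Hypothetical yearly rates of mortality for three successive years of age.
Q_YOUNGER = [0.010, 0.011, 0.012]
Q_OLDER = [0.030, 0.033, 0.036]
```

Only the yearly rates come from observation; every probability of the compound kind is built up from them.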

* The formula deduced by Laplace, by which the true probability of an
event which has been observed to happen m times out of m + n trials is taken as
$\frac{m+1}{m+n+2}$, is obviously not applicable to such a function as the rate of mortality,
nor to any analogous function. It is sufficient to consider that in tabulating the
values of the probability of dying in each year of age, we are using an arbitrary
unit of time which might just as well be a month or day, in which cases we should,
by use of the above formula, produce quite different mortality tables from the
same data.

The degree of certainty with which we can deduce the
properties of an entire class from the part known to us,
depends first on our assurance that the class is homogeneous,
or at least that the portion observed is representative, such
as would result from a selection of cases made at random, and
secondly, on the number of cases that have been under
observation. If we examine the figures in tables similar
to  Tables  I  and  II,  we  see  that,  in  proportion  as  the  number 
of cases under observation is small, the figures representing
the results of the experience are irregular, while, on the other
hand, where the number of facts observed is very large, the
irregularities become relatively less. We arrive at the same
conclusion from theory. If an indefinitely large group $N$
contains $Np$ objects of class A and $N(1-p)$ objects not of
class A, and if from the group $n$ objects are selected at
random, then on the average $np$ of these will be of class A.
If we represent the observed number in any given case as
$np + z$, the average algebraical value of $z$ will be zero,
while its average numerical value, irrespective of sign, will
be very nearly $\tfrac{4}{5}\sqrt{np(1-p)}$.* This latter quantity clearly
increases as $np$ increases, but at the same time its ratio to
$np$ diminishes. Thus in a table of exposed to risk and died
the actual irregularities in the number of deaths increase
with the magnitude of the experience, but the irregularities
in the rate of mortality diminish. Hence from theory as from
experience we derive the conviction that if instead of the
limited number of facts which we have been able to examine,
we  could  have  examined  an  indefinitely  large  number  of 
similar  facts,  the  results  would  have  been  relatively  free 
from  irregularity,  and  capable  of  being  expressed  by  a 
continuous  curve  ;  without,  of  course,  being  sure  that  any 
such  curve  could  be  expressed  algebraically. 
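
The behaviour of the accidental deviation $z$ can be verified numerically. The sketch below assumes that the number of objects of class A among the $n$ selected follows the binomial distribution, and compares the exact average numerical deviation with $\sqrt{2np(1-p)/\pi}$, the value given by the normal curve, of which four-fifths of $\sqrt{np(1-p)}$ is a close approximation. The function names are illustrative. The last function computes the average value of any power of $z$ for comparison with the standard binomial results.

```python
from math import exp, lgamma, log, pi, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k objects of class A among n (requires 0 < p < 1)."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def mean_absolute_deviation(n, p):
    """Exact average value of |z|, where z is the observed number less np."""
    mean = n * p
    return sum(abs(k - mean) * binomial_pmf(k, n, p) for k in range(n + 1))


def normal_approximation(n, p):
    """Average numerical deviation under the normal curve: sqrt(2npq / pi)."""
    return sqrt(2 * n * p * (1 - p) / pi)


def central_moment(n, p, r):
    """Exact average value of z**r for the binomial distribution."""
    mean = n * p
    return sum((k - mean) ** r * binomial_pmf(k, n, p) for k in range(n + 1))
```

Quadrupling $n$ roughly doubles the average deviation in the number observed, while halving its ratio to $np$; this is the sense in which the irregularities in the deaths increase while those in the rate diminish.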

The  idea  underlying  the  graduation  of  the  figures  of  a 
statistical table, whatever be the process employed, is that a
continuous curve may be found representing the general trend
of the observations freed from irregularities due to paucity
of material. This curve, we have reason to believe, will
correspond more closely than the ungraduated curve to the
results obtainable from a much larger body of facts. This is
the rationale of the process of graduation and its justification.
Such a process cannot deal with systematic errors affecting
the table as a whole and cannot compensate for inadequate
data. It adds weight to the results, however, at each
individual point of the table, and assists in bringing into
relief the true character of the curve by freeing it, in a
large  measure,  from  accidental  irregularities. 

* The average value of $z^2$ will be $npq$, the average value of $z^3$ will be
$npq(q-p)$, and the average value of $z^4$ will be $npq[(3n-6)pq+1]$, where $q = 1-p$. See Note A, p. 110.

There may be other objects aimed at in a graduation
besides  that  of  removing  the  irregularities  from  the  rough 
figures,  with  the  view  of  bringing  out  more  clearly  the  law 
underlying  them.  The  Actuary  constructs  tables  not  merely 
to  show  what  has  happened  in  the  past,  but  to  enable  him  to 
forecast  the  future,  and  as  he  requires  these  tables  as  a  basis 
for  financial  operations,  considerations  are  introduced  which 
do not arise in the treatment of purely statistical tables.
Whatever  class  of  events  the  Actuary  may  have  to  deal  with, 
will  be  subject  to  change  with  the  lapse  of  time.  That 
portion  of  the  class  he  has  been  able  to  observe  lies 
necessarily  in  the  past ;  the  conclusions  he  has  derived  from 
their  study  he  proposes  to  extend  to  the  future.  He  must 
therefore  consider  how  far  the  observed  characters  of  the 
class  are  changing  or  permanent,  and  must  endeavour 
to  distinguish  between  changes  representing  permanent 
tendencies  and  those  due  merely  to  temporary  fluctuations. 
In  the  selection  of  data  suitable  for  his  purpose  the  Actuary 
will  aim  on  the  one  hand  at  a  sufficiently  broad  basis  both  in 
space and time to eliminate the effects of local and temporary
fluctuations,  and  on  the  other  hand  he  will  aim  at  obtaining 
as  far  as  possible  a  homogeneous  group  of  data.  These  two 
aims  are  more  or  less  in  conflict,  and  he  will  lean  to  the  one 
side  or  the  other,  according  to  the  object  he  has  in  view. 
Where,  for  example,  that  object  is  to  produce  a  table  that 
may  be  adopted  as  a  general  standard  by  various  institutions, 
often differing considerably as to their individual experience,
he  must  aim  at  a  correspondingly  broad  foundation.  In 
these  circumstances  it  will  not  generally  be  possible  to  obtain 
a  really  homogeneous  experience.  If  it  is  a  question  of  the 
mortality  of  assured  lives,  for  instance,  this  will  be  found  to 
be  affected  by  endless  individual  variations,  age,  sex,  duration 
of  assurance,  occupation,  civil  condition,  class  of  assurance, 
character  of  the  insuring  office,  &c.,  &c.,  and  from  such 
material  approximately  homogeneous  data  could  only  be 
obtained  by  cutting  up  the  experience  into  comparatively 
small  groups  and  thus  sacrificing  all  generality.  This  can  be 
avoided  in  practice  by  first  excluding  all  extreme  variations. 
The  sexes  will  be  separately  treated,  lives  so  impaired  as  to 
prospects  of  longevity  by  personal  health,  family  history, 
occupation, or residence in unhealthy districts as to be "rated
up" will be excluded, as also classes of assurance that may
be supposed subject to rates of mortality differing from the
average. When the data have thus been trimmed of the
extreme  variations,  a  body  of  experience  will  generally 
remain  not  greatly  shrunken  from  its  original  dimensions 
and  in  which  the  discontinuous  variations  are  sufficiently 
numerous  and  individually  unimportant  to  render  the  data 
for  practical  purposes  homogeneous.  The  rates  of  mortality, 
or  of  withdrawal,  can  then  be  treated  as  functions  of  the  two 
remaining  variables  of  importance,  the  age  and  the  time 
elapsed  from  date  of  entry ;  or  as  functions  of  the  age  only 
from  the  point  at  which  the  factor  of  duration  may  be  found 
to  be  unimportant. 

On  the  other  hand,  the  Actuary's  object  may  be  precision 
rather  than  generality ;  he  may  have  to  deal  with  a  group, 
subject  to  special  conditions  and  presenting  special 
characteristics,  as  is  usual  in  the  case  of  pension  funds  and 
friendly  societies.  Here,  if  the  data  are  at  all  adequate,  better 
results  will  be  obtained  therefrom  than  by  having  recourse  to 
any  general  experience.  Where  it  is  insufficient  by  itself  as 
a  basis  for  statistical  tables  it  may  serve  as  an  indication  as 
to  what  standard  table  is  the  most  suitable  to  employ  and  as 
to  how  far  and  in  what  direction  it  may  be  desirable  to 
introduce  any  modifications  therein.  In  an  experience  of 
this  character  the  data  may  sometimes  be  very  heterogeneous, 
but  there  is  usually  the  safeguard  that  its  composition  is 
approximately  constant. 

A  question  of  some  importance  may  here  be  considered, 
namely,  the  relative  claims  of  lives,  policies,  or  amounts 
assured  to  form  the  basis  of  the  mortality  table.  In  the 
17 Offices' data, the number of policies, in the $H^M$ and $O^M$
data, the number of lives passing under observation
constitute the basis of the experience, while in the American
Offices' Experience (1880) the sum assured was the unit. In
the instances of the $H^M$ and $O^M$ Tables, wherever a life would
have been doubly observed the duplicate assurance was
eliminated. In justification of the use of the sums assured
as the basis of the experience, in lieu of the number of lives,
it may be said that in this way we represent the financial
effect  of  the  mortality,  as  it  makes  no  difference  to  the 
insuring  company  whether  one  claim  arises  for  £10,000  or 
one  hundred  claims  for  £100  each.  There  are,  however,  serious 
objections to employing the sums assured as a basis for a
mortality table, based upon a general experience. Either the
mortality  among  the  lives  carrying  large  sums  assured  is 
similar  to  the  average  or  it  is  not.  If  it  is  similar,  the 
general  character  of  the  table  will  not  be  affected  by  the 
additional  weight  given  to  these  lives  in  the  experience,  but 
the  irregularities  in  the  deduced  rates  of  mortality  will  be 
considerably  increased.  The  result,  indeed,  will  be  virtually 
the  same  as  if  we  had  used  a  part  only  of  the 
available  data,  selected  at  random,  instead  of  the 
whole.  If,  on  the  other  hand,  the  mortality  among  the 
lives  insured  for  large  sums  is  materially  different  from 
the  average,  then  the  experience  is  not  homogeneous. 
As  a  matter  of  fact,  these  lives  of  themselves  do  not  form  a 
homogeneous  group.  In  certain  societies  they  appear  to  give 
better  rates  of  mortality  than  the  average ;  in  others,  where 
they  are  mainly  represented  by  non-profit  policies  effected  for 
commercial  reasons,  they  are  no  doubt  subject  to  higher  rates 
of  mortality  than  the  average.  As  in  a  general  experience, 
combining  the  individual  experience  of  many  offices,  these 
lives  will  represent  an  exceptional  or  abnormal  element, 
which  may  or  may  not  persist  in  the  future,  and  will  certainly 
not  persist  equally  in  all  societies,  it  is  not  desirable  in 
deducing  a  general  mortality  table  to  specially  "  weight  up  " 
this  part  of  the  data. 
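
The remark that weighting by sums assured is virtually equivalent to using a part only of the data may be put in figures. The sketch below rests on assumptions that are not the lecturer's: that the lives are independent and subject to a common rate of mortality $q$. On those assumptions the variance of the rate deduced from amounts is $q(1-q)\sum s^2 / (\sum s)^2$, which is the variance of a rate deduced from $(\sum s)^2 / \sum s^2$ lives counted equally. The function name is illustrative.

```python
def effective_number_of_lives(sums_assured):
    """Number of equally weighted lives that would give the same variance as
    the amount-weighted experience (independent lives, common rate assumed)."""
    total = sum(sums_assured)
    return total * total / sum(s * s for s in sums_assured)


# Hypothetical experience: one hundred lives assured for 100 each,
# and a single life assured for 10,000.
SUMS_ASSURED = [100] * 100 + [10_000]
```

With these hypothetical figures the 101 lives carry no more weight than about four lives counted equally, the single large policy dominating the result.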

The  same  considerations  apply,  but  with  somewhat 
less  force,  to  the  plan  of  making  policies  rather  than 
lives  the  basis  of  an  experience.  Without  dogmatizing 
upon  the  point,  it  appears  to  me  that  the  proper  course  is, 
where  two  or  more  policies  are  effected  at  the  same  time  or 
at  the  same  age  at  entry,  to  treat  them  as  a  single  risk, 
but  where  the  subsequent  policies  are  effected  at  later  ages, 
involving  fresh  medical  selection,  to  treat  them  as  separate 
risks.  This  means  the  elimination  of  duplicates  in  each  of 
the "select" tables for individual ages at entry, but no
further  elimination  in  the  resulting  aggregate  tables,  a  course 
which  has  the  advantage  of  making  the  aggregate  table  the 
true  aggregate  of  the  tables  for  separate  ages  at  entry. 
Judging by the results of the $O^M$ experience, this course is
necessary if we are to produce an aggregate table,
representing "ultimate" rates of mortality after the lapse of
a stated period from entry, which will join on smoothly to
the "select" rates.

A detail of less importance, but of considerable interest,
is the question of the proper treatment of withdrawals in
a mortality experience. These are usually treated as
withdrawing upon the termination of the days of grace in
case of lapse by non-payment of premium, and for the
purpose of obtaining the true measure of the mortality
experienced this course is the correct one. It should be
borne in mind, however, that to arrive at the financial effect
of the mortality the numbers of the exposed to risk should
correspond to the number of annual premiums paid, and from
this point of view the life withdrawing should not be treated
as at risk during the days of grace. The difference in the
resulting mortality rates according to the two methods is, of
course, very slight.
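
Returning to the treatment of duplicate policies proposed earlier in this lecture, the rule may be sketched as follows: policies on the same life effected at the same age at entry count as a single risk, while policies effected at later ages count as separate risks. The record layout (a life identifier paired with an age at entry) and the function name are hypothetical, adopted only for illustration.

```python
from collections import defaultdict


def count_risks(policies):
    """policies: iterable of (life_id, age_at_entry) pairs, one per policy.
    Returns the number of risks at each age at entry after duplicates on the
    same life at the same entry age have been merged into one."""
    distinct = {(life_id, age) for life_id, age in policies}
    risks_by_entry_age = defaultdict(int)
    for _life_id, age in distinct:
        risks_by_entry_age[age] += 1
    return dict(risks_by_entry_age)


# Hypothetical records: life "A" has two policies at entry age 30 and a
# later one at 35; life "B" has a single policy at entry age 30.
POLICIES = [("A", 30), ("A", 30), ("A", 35), ("B", 30)]
```

Because no further elimination is made across entry ages, the aggregate count is simply the sum of the counts for the separate ages at entry.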


SECOND  LECTURE. 


HAVING dealt in the last lecture with the rationale
of graduation in general, I now propose to refer more
particularly to the principles underlying certain special
methods of graduation. We may divide the various methods
which are in use into three classes:

1. Graphic methods.

2. Methods based upon Interpolation or Finite
   Difference formulae, such as Mr. Woolhouse's.

3. Methods which depend upon the use of Frequency
   Curves, in which we may include all methods
   based upon the assumption that the series to be
   graduated can be represented as some function of
   the variable.

Certain general considerations apply to all these methods.
We may have to deal either with a single series of numbers,
such as the number, at successive ages, of lives effecting
assurances, of persons enumerated at a census, or attacks
from a given disease, &c.; or, as more often happens in
actuarial statistics, the fact of importance may be the ratio
between the corresponding members of two series of numbers
as in a table of "Exposed to Risk" and "Died" forming the
basis of the Mortality Table, where the fact sought is the rate
of mortality at each age given by the ratio of the Died to
the Exposed to Risk, the actual numbers of these being
of importance mainly as affording a measure of the
trustworthiness of the deduced ratio.
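
The two uses of the figures, the ratio as the fact sought and the actual numbers as the measure of its trustworthiness, may be illustrated as below. The standard error shown assumes the deaths to be binomially distributed, which is an assumption of this sketch, and the figures and function names are hypothetical.

```python
from math import sqrt


def crude_rate(died, exposed):
    """Ungraduated rate of mortality: ratio of the Died to the Exposed to Risk."""
    return died / exposed


def standard_error(died, exposed):
    """Approximate standard error of the crude rate (binomial assumption)."""
    q = died / exposed
    return sqrt(q * (1.0 - q) / exposed)


# Two hypothetical experiences giving the same rate from different numbers.
SMALL = (10, 1000)   # (died, exposed)
LARGE = (40, 4000)
```

The two experiences yield the same rate, but the larger, having four times the exposure, gives it with about half the standard error.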

Where  only  a  single  series  of  numbers  is  involved,  the 
problem  is  comparatively  simple,  and  an  accurate  solution  is 
not  generally  of  great  importance  to  the  actuary.  In  the 
more  usual  case  where  the  ratio  of  the  corresponding 
members  of  two  series  of  numbers  is  in  question,  the  problem 
is  more  complicated.  We  have  a  choice  of  procedure  :  we 
may either graduate independently the two series of numbers
(in the case supposed the numbers of the "Exposed to
Risk"  at  each  age  and  the  numbers  of  the  "Died"),  or, 
disregarding  the  irregularities  in  the  two  series,  we  may 
proceed  to  deal  at  once  with  the  ratios  only.  If  each  series 
can  be  satisfactorily  graduated,  the  resulting  curves  being 
smooth and fitting the ungraduated series sufficiently closely
(that is to say, within the limits of the errors of observation),
we  may  then  assume  that  the  ratios  of  the  corresponding 
terms  (in  the  case  supposed  the  rates  of  mortality)  will  also 
be  within  the  limits  of  error.  It  may  also  be  said  that  by 
working  with  the  rough  facts  themselves,  rather  than  the 
ratios  between  the  two,  we  keep  in  view  the  weight  of  the 
observations  at  each  point  of  the  curve,  and  are  able  to  see 
at  once  how  far  our  graduated  numbers  vary  from  the 
original,  and  how  far  that  variation  is  justified  by  the  number 
of  facts  at  each  particular  point.  There  are,  however,  some 
important  objections  to  this  course.  In  the  first  place,  the 
ratio  between  the  corresponding  terms  in  the  two  series  of 
numbers  represents  generally  a  relatively  stable  quantity, 
whereas  the  actual  numbers  in  either  series,  depending  as 
they  do  upon  the  extent  of  the  experience  under  review  at 
particular  ages,  are  liable  to  fluctuations  of  a  more  or  less 
arbitrary character. Further, supposing the graphic method
of graduation or the method of finite differences is employed
(in either case the argument is applicable, although specially
so in the former), it will be found that each curve will
contain  certain  outstanding  irregularities,  as  it  is  not  possible 
entirely  to  remove  all  irregularities  by  those  methods.  Hence 
in the adjusted ratios two sets of irregularities will be superimposed
and a less satisfactory series of values obtained than
if  the  ratios  themselves  had  been  dealt  with. 

A  stronger  objection,  when  dealing  with  a  mortality 
experience,  to  graduating  separately  the  numbers  in  the  two 
series of "Exposed to Risk" and "Died" rather than their
ratio, is that we thereby discard our previous knowledge of
the nature of the curve expressing that ratio (our general
knowledge, that is, of the nature of the curve $q_x$ or $\mu_x$),
knowledge which is of considerable assistance in graduating the
commencement and end of the table where the data are few.

Where a graduation of both series of numbers is made, it
is preferable, indeed necessary if the best results are to be
obtained, after first graduating the series corresponding to
the "Exposed to Risk", to re-compute the numbers of deaths,
lapses or marriages, as the case may be, on the basis of the
graduated numbers of the Exposures, and to operate upon
these adjusted numbers. We are in this way less likely to
obscure  the  law  of  the  series  representing  the  required  ratios. 
Notwithstanding  any  theoretical  objections,  there  may  be 
occasions  on  which  it  is  more  convenient,  or  even  necessary, 
to  deal  with  the  two  series  separately ;  where,  for  example, 
as  in  the  Registrar-General's  returns  of  the  population  and 
deaths  for  certain  occupations,  we  have  not  the  facts  for 
individual ages, but only in certain large groups. The
ratios of deaths to exposures for each age group are obviously
not  satisfactory  approximations  to  the  rate  of  mortality  for 
the  central  age  of  the  group.  In  these  circumstances  it 
appears  to  be  best  to  adopt  a  plan  similar  in  principle, 
though  not  in  detail,  to  that  employed  by  Milne  in  graduating 
the  Carlisle  Table,  and  to  draw  curves  respectively  through 
the  parallelograms  representing  the  exposures  and  the  deaths, 
and from these deduce the numbers for individual ages. The
graphic method, however, is not very suitable for this purpose,
and the use of interpolation formulae does not always give
good  results.  It  is  generally  better  to  make  use  of  suitable 
frequency  curves.  It  will  be  seen  later  that,  where  the 
number  of  groups  is  rather  small,  the  use  of  the  normal 
frequency  curve,  with  certain  modifications,  enables  us  to 
re-distribute  the  numbers  representing  the  groups  of 
"  Exposed  "  and  "  Died  ",  and  so  obtain  graduated  numbers 
for  each  age,  and  hence  from  the  ratios  of  these  a  graduated 
rate of mortality. (See the Sixth Lecture, p. 91.)
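
Why the ratio for a wide age group is not a satisfactory approximation to the rate at its central age may be seen from a small calculation. The sketch assumes, purely for illustration, a rate of mortality increasing in geometrical progression with age and an equal number exposed at every age of the group; the constants and function names are hypothetical.

```python
def hypothetical_rate(age, b=0.0005, c=1.09):
    """Assumed rate of mortality increasing geometrically with age."""
    return b * c ** (age - 20)


def group_ratio(ages, exposed_per_age=1000.0):
    """Ratio of total deaths to total exposed for a group of ages, with the
    same (assumed) number exposed at each age."""
    deaths = sum(exposed_per_age * hypothetical_rate(x) for x in ages)
    return deaths / (exposed_per_age * len(ages))


# Hypothetical wide group covering ages 45 to 65, with central age 55.
GROUP_AGES = range(45, 66)
CENTRAL_AGE = 55
```

On these assumptions the group ratio exceeds the rate at the central age, since the heavier mortality of the older half of the group more than offsets the lighter mortality of the younger half. In practice the unequal numbers exposed at the several ages disturb the comparison further.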


We  shall  now  assume  that  we  are  dealing,  not  with  the 
two  independent  series,  but  with  the  ratio  between  the  two ; 
as, for example, with $q_x$, or some analogous function.

We may consider we have three independent estimates of
the value of $q_x$:

1st. That derived from the observed ratio of the
     died to the exposed at age x.
2nd. That derived from the data at neighbouring
     ages.
3rd. That derived from previous experience of more
     or less similar data.

The first and second should be suitably combined in the
process of graduation. The last is, in the nature of things,
a very vague estimate, and bears a relation to that derived
directly from the observations, if these are numerous, similar
to that of a rough measurement by inferior instrumental
means to one made by an instrument of precision. In such
case no weight attaches to it.

There are circumstances, however, in which the a priori
estimate of the values of $q_x$ becomes important, viz., when the
observations at our disposal are extremely few. As the
extent of our observations diminishes, the numbers of exposures
and deaths becoming smaller, the weight to be attached to
the deduced values of the rate of mortality becomes less, and
a point is eventually arrived at when we obtain more
trustworthy results by considering to what particular class
of  examined  data  the  experience  most  nearly  conforms  in 
character,  and  falling  back  upon  the  results  of  such  related 
experience. 

If  we  have  to  deal  with  a  large  experience,  a  somewhat 
similar  difficulty  arises  at  the  commencement  and  end  of  the 
table.  Generally  speaking,  we  then  derive  more  trustworthy 
values  for  the  rates  at  these  ages  from  a  consideration  of  the 
general  trend  of  the  curve  and  our  previous  approximate 
knowledge  of  its  character,  than  by  falling  back  upon  any 
related  experience. 


Coming  to  the  principles  underlying  each  of  these  three 
methods  of  graduation,  we  consider  first  the  graphic  method, 
whether  in  the  form  employed  by  Milne  or  in  the  preferable 
form employed by Dr. Sprague. This method makes no
further assumption than that the series with which we
are dealing would, if the observations were sufficiently
extensive, form a continuous and regular curve, and that the
irregularities actually occurring in the ungraduated values
are due to the smallness of the data.

To Dr. Sprague (J.I.A., vol. xxvi, p. 77) we owe the
most systematic and satisfactory exposition of the graphic
method. An essential feature in his procedure is the
preliminary division of the data (which we may suppose
arranged by years of age) into groups, so selected as to afford
a steady progression in the average rates of mortality for
successive groups, due regard being had to the range of
these  groups.  For  examples  of  the  method,  the  student  must 
be  referred  to  Dr.  Sprague's  original  papers.  This  process  of 
dividing  the  data  into  selected  groups  appears  at  first  sight  to  be 
arbitrary,  but  it  may  be  justified  on  the  grounds  : — (1)  That 
in  a  series  of  observations  such  as  we  are  discussing,  where 
at  each  age  the  results  are  affected  by  irregularities  or  errors 
of  observation,  a  successful  graduation  will  reduce  the  sum 
of  these  errors  and  also  the  sum  of  the  "  accumulated  "  errors 
to  zero,  or  nearly  so.  Hence  if  we  compute  at  each  age  the 
accumulated  errors  (reckoning  from  either  end  of  the  series) 
these  must,  in  order  that  their  sum  may  be  approximately 
zero,  change  sign,  thus  passing  through  zero,  fairly  frequently. 
The  data  will,  therefore,  be  made  up  of  consecutive  groups, 
larger  or  smaller,  in  each  of  which  there  is  an  approximate 
balance  of  errors,  and  it  may  be  assumed  that,  with  a  sufficient 
amount  of  experience  and  the  exercise  of  some  trouble,  these 
groups  can  be  found  by  inspection  and  trial.  (2)  In  further 
justification  of  this  procedure,  it  is  to  be  noted  that  the  rates 
of  mortality  deduced  from  the  average  rates  in  the  selected 
groups  are  used  as  a  first  approximation  only,  the  final  rates 
being  arrived  at  by  repeated  comparison  of  the  graduated 
deaths  with  the  actual  numbers  until  a  sufficiently  smooth 
curve  and  a  sufficiently  close  agreement  has  been  obtained. 
At  the  same  time  I  am  not  convinced  that  the  use  of  these 
specially  selected  groups  has  any  real  advantage  over  the  use 
of  groups  of  constant  range,  as  quinquennial  or  decennial, 
provided  the  operator  recognizes  that  he  cannot  look  for  an 
absolute  balance  of  errors  in  these  latter,  but  must  regard 
them  as  equally  subject  to  errors  of  observation  with  the 
numbers at individual ages.

Assuming  it  to  be  practicable  to  draw  a  sufficiently 
smooth  curve,  free  from  sudden  changes  of  curvature,  and 
yet  representing  the  observations  sufficiently  closely  with  a 
due  regard  to  their  weight  in  different  parts  of  the  table, 
there  would  appear  to  be  nothing  to  object  to  in  the  principle 
of  the  graphic  method  of  graduation.  In  practice,  however, 
there  are  certain  difficulties.  The  first,  particularly  in  the 
case  of  a  mortality  table,  is  the  question  of  scale.  Anyone 
who  has  attempted  to  make  graphic  graduations  will,  I  think, 
have  met  with  this  practical  difficulty.  Whether  we  graduate 
separately the "Exposed to Risk" and "Died", or whether
we graduate a function such as $q_x$, the difficulty equally
arises. The values of $q_x$ may range in practice from about
.005 to, say, about .5, and at the older ages increase so
rapidly that the eye does not readily grasp the nature of the
curve. In order that it may do so, and that the curve may
be drawn and read off with sufficient accuracy, a certain
proportion must be maintained between the horizontal and
the perpendicular scale, so that the curve shall not cut the
ordinates at too acute an angle. It is also necessary to
represent the values of $q_x$ in two or three sections, as the
scale  suitable  to  the  older  ages  will  not  permit  of  the  values 
at  the  younger  ages  being  represented  with  sufficient 
accuracy. 

Instead  of  operating  on  the  rates  of  mortality,  we  may 
with  advantage  employ  the  logarithms  of  the  rates,  or  the 
logarithms of the central death rates.* We thus obtain a
curve which is much more easily dealt with. From the fact
that  the  rates  of  mortality  change  slowly  at  the  younger 
ages,  and  at  the  older  ages  generally  approximate  to  a 
geometrical  progression,  the  logarithms  of  the  rates  are 
nearly  in  the  form  of  an  arithmetical  progression,  and  are 
represented  by  a  line  having  very  little  curvature.  At  the 
oldest  ages,  indeed,  it  may  very  conveniently  be  taken  as  a 
straight line.
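As a rough numerical illustration of the remark above, the following sketch takes rates in an assumed geometrical progression (the constants `B` and `c` are purely illustrative and are not drawn from any table in these Lectures) and shows that the logarithms advance by a constant difference, so that they lie on a straight line, while the rates themselves range over more than two orders of magnitude.

```python
import math

# Hypothetical rates in geometrical progression, m_x = B * c**x.
# B and c are illustrative assumptions, not values from the text.
B, c = 0.00005, 1.10
ages = list(range(40, 91, 10))
rates = [B * c**x for x in ages]
log_rates = [math.log10(m) for m in rates]

# First differences of the logarithms: constant for a geometrical
# progression, so the curve of log rates is a straight line.
diffs = [b - a for a, b in zip(log_rates, log_rates[1:])]

for x, m, lg in zip(ages, rates, log_rates):
    print(f"age {x}: rate {m:.5f}  log10 rate {lg:.4f}")
print("differences of log10 rate:", [round(d, 4) for d in diffs])
```

A single vertical scale serves the whole of the logarithmic curve, whereas the rates themselves would need to be drawn in sections.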

Perhaps  the  main  difficulty  in  graphic  graduation  is  that 
it  is  by  no  means  easy,  even  with  mechanical  aids,  to  draw  a 
sufficiently  smooth  curve.  The  curve  as  drawn  may  appear 
to  be  smooth,  but  on  reading  it  off  and  examining  the  series 
of  values  obtained,  we  find  irregularities  which,  in  order  to 
produce  a  satisfactory  graduation,  must  be  removed  by  a 
further  adjustment.  If  we  are  dealing  with  a  relatively 
small  experience — in  which  cases  these  practical  difficulties 
are  correspondingly  increased — they  may  be  overcome  to  a 
large  extent  by  using  as  a  base  line  a  well-graduated  standard 
table representing an experience of similar character. By
computing the "expected" deaths according to the standard
table, and dealing with the ratio of the actual to the
"expected" deaths in successive age groups, we avoid the
difficulties due to inequality of scale and to the rapid increase
in the value of the ordinates at the extreme ages. The curve

* See, however, Note B, p. 111, as to precautions in dealing with logs of rates
of mortality and similar functions.

of ratios, apart from accidental fluctuations, will often be
found to approximate to a straight line, the departures from
which  can  be,  of  course,  represented  on  a  relatively  large 
scale.  In  particular,  the  difficulty  arising  from  the  paucity 
of  observations  at  either  end  of  the  table  will  be  avoided  by 
making  each  extremity  of  the  curve  of  ratios  terminate  in 
a  straight  line,  the  locus  of  which  will  depend  upon  the 
general  trend  of  the  curve  in  the  neighbourhood.  The 
resulting  values  at  the  extremes  of  the  table  obtained  in 
this  way  will  be  more  trustworthy  than  those  obtained 
without the aid of the standard base line.*
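The use of a standard table as a base line may be sketched as follows. All the figures (age groups, exposures, deaths and standard rates) are hypothetical and serve only to show the arithmetic: the "expected" deaths are computed from the standard rates, and it is the ratios of actual to expected deaths that would then be plotted and smoothed.

```python
# Hypothetical quinquennial groups; every figure below is illustrative only.
groups = ["40-44", "45-49", "50-54", "55-59"]
exposed = [5200, 4800, 4100, 3300]        # exposed to risk in each group
actual = [50, 62, 80, 98]                 # observed deaths
q_std = [0.0100, 0.0135, 0.0190, 0.0280]  # assumed standard-table rates

# "Expected" deaths according to the standard table.
expected = [e * q for e, q in zip(exposed, q_std)]

# Ratios of actual to expected deaths: the quantity to be graduated.
ratios = [a / x for a, x in zip(actual, expected)]

for g, x, r in zip(groups, expected, ratios):
    print(f"{g}: expected {x:.1f}, ratio {r:.3f}")
```

Because the ratios here all lie between .9 and 1.1, they can be drawn on a far larger scale than the rates themselves; the graduated rate in any group is then the smoothed ratio multiplied by the standard rate.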


In  finite  difference  or  interpolation  methods  of  graduation 
(of  which  we  may  take  Woolhouse's  as  the  best  known  type) 
the  underlying  assumption  is  virtually  the  same  as  in  the 
graphic  method,  viz.,  that  the  curve  is  of  such  a  nature  that 
the  ordinary  methods  of  interpolation  can  be  applied.  Put 
more precisely, Woolhouse's method assumes that for a range
of 15 consecutive ages the values of $l_x$ can be represented
with sufficient accuracy by a curve of the third order, i.e.,
$l_{x+t} = l_x + at + bt^2 + ct^3$ when $t$ is not numerically > 7. As this
assumes the fourth and higher differences of $l_x$ to be zero, we
may write

$$l'_x = \frac{1}{125}\{-3(l_{x-7}+l_{x+7}) - 2(l_{x-6}+l_{x+6}) + 3(l_{x-4}+l_{x+4}) + 7(l_{x-3}+l_{x+3}) + 21(l_{x-2}+l_{x+2}) + 24(l_{x-1}+l_{x+1}) + 25\,l_x\}$$

where $l'_x$ may be taken as the graduated value of that
function,  the  quantities  on  the  right-hand  side  of  the  equation 
being  the  ungraduated  values. 

This formula, which is that used by Woolhouse in the
graduation of the H^M Table, is of course only one of
numerous possible formulas deducible from the above expression
for $l_{x+t}$. Others may be found resulting in a smoother
graduated series, but all the formulas since proposed as
improvements on his are based upon the same general
principle. An indefinite number of such formulae can be
found, even when the range is fixed.† In particular may be


* See Lidstone, J.I.A., xxx, p. 212. These remarks are equally applicable
to graduation by a finite difference formula (see J.I.A., vol. xli, p. 89).
† See Todhunter, J.I.A., xxxii, 378; G. F. Hardy, J.I.A., xxxii, 371.

mentioned Mr. J. A. Higham's, Dr. Karup's, and that used
by Mr. J. Spencer in the graduation of the "Manchester
Unity" mortality experience. See the following table
showing the value of $u'_x$ in terms of the ungraduated $u$'s:

Table V.

Showing the values of $\phi_t$, where $u'_0 = \Sigma\, u_t \times \phi_t$, by various
well-known Graduation Formulae.

Distance from     Spencer
Central Term      21-term      Karup      Higham     Woolhouse
     t            Formula

     0             .172         .200       .200        .200
    ±1             .163         .182       .192        .192
    ±2             .135         .139       .144        .168
    ±3             .095         .085       .080        .056
    ±4             .052         .034       .024        .024
    ±5             .017         .000       .000        .000
    ±6            -.005        -.013      -.016       -.016
    ±7            -.015        -.014      -.016       -.024
    ±8            -.015        -.010      -.008        .000
    ±9            -.009        -.003       .000
   ±10            -.003         .000
   ±11             .000
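The working of such a formula may be illustrated by a minimal sketch of Woolhouse's, using the numerators over 125 that correspond to the Woolhouse column of Table V. The function name `woolhouse` and the test series are assumptions made for illustration only. Since the formula is correct to third differences, a series following a curve of the third order should be reproduced unchanged.

```python
# Woolhouse's coefficients (numerators over 125) for t = 0, 1, ..., 7,
# corresponding to the Woolhouse column of Table V.
WOOLHOUSE = [25, 24, 21, 7, 3, 0, -2, -3]

def woolhouse(values):
    """Return graduated values; the first and last 7 positions are None,
    because the formula needs 7 neighbours on each side."""
    n = len(values)
    out = [None] * n
    for i in range(7, n - 7):
        total = WOOLHOUSE[0] * values[i]
        for t in range(1, 8):
            total += WOOLHOUSE[t] * (values[i - t] + values[i + t])
        out[i] = total / 125
    return out

# An illustrative cubic series: it should pass through the formula unaltered.
cubic = [1000 - 3 * x + 0.5 * x**2 - 0.01 * x**3 for x in range(20)]
smoothed = woolhouse(cubic)
```

The seven undefined values at each end of the series are the terminal values which the formula itself cannot supply.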

It is clear that no such formula will entirely remove
the irregularities in the series, and in Woolhouse's graduation
of the H^M Table the outstanding irregularities were removed
by an empirical process similar to that employed for the
graduation of the 17 Offices' Table, and described in his
paper (J.I.A., vol. xii, pp. 140-1). The object aimed at in
a formula such as these, should be so to select the coefficients
of the terms on the right hand that, while giving an
expression for the value of the central function correct as far
as the order of differences employed, the formula will
produce the maximum smoothness in the flow of the
graduated values. This may be done by simple experiment,
or we may adopt some empirical measure or standard of
smoothness and thereby compute the most advantageous
coefficients. We may, for example, adopt as our standard
of smoothness the extent to which the second differences
of our graduated function are affected by the errors of
observation in the original table.

Applying this standard to Woolhouse's formula, we have
for the graduated second central difference of $l_x$ (using
central differences for the sake of symmetry):

$$125\,\Delta^2 l'_{x-1} = -3l_{x-8} + 4l_{x-7} + l_{x-6} + l_{x-5} + l_{x-4} + 10l_{x-3} - 11l_{x-2} - 2l_{x-1} - 2l_x - 2l_{x+1} - 11l_{x+2} + 10l_{x+3} + l_{x+4} + l_{x+5} + l_{x+6} + 4l_{x+7} - 3l_{x+8}.$$

If we assume that on the average each of the ungraduated
values of $l_x$ on the right-hand side of this equation is subject
to a mean error of $\pm\epsilon$, and if we assume that these errors
may be combined according to the normal law, then the mean
error of the entire expression for $\Delta^2 l'_{x-1}$ will be found by
multiplying $\epsilon$ by the square root of the sum of the squares of
the coefficients, giving

$$\frac{\sqrt{3^2 + 4^2 + 1^2 + 1^2 + 1^2 + \text{\&c.}}}{125}\,\epsilon = \frac{\sqrt{510}}{125}\,\epsilon = .18\,\epsilon.$$

In the same way it may be shown that in Karup's formula
the mean error in $\Delta^2 u'_{x-1}$ is about $.068\,\epsilon$, where $\epsilon$ is the
mean error of a single value of $u_x$. It must not be supposed
from these results that the mean errors in the graduated
values of $l_x$ or $u_x$ are proportionately reduced. The mean
errors in the graduated functions when Woolhouse's formula
is employed are reduced to about .42 of the mean errors in
the ungraduated functions, or are about equivalent to the
mean errors of the ungraduated values corresponding to an
experience 5½ times larger. The graduated table based on
the smaller data would, however, be smoother than the
ungraduated table based upon the larger data. (See J.I.A.,
xxxii, pp. 376-7.)
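The figures quoted above may be checked by a short computation. In the sketch below the Woolhouse numerators (over 125) are those of his formula; the Karup numerators (over 625) are an assumption, being integers chosen to agree with the Karup column of Table V to three decimal places. The function names are likewise illustrative.

```python
import math

# Coefficients for t = -7 ... +7 (Woolhouse, over 125).
WOOLHOUSE = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]
# Assumed integers over 625, consistent with the Karup column of Table V.
KARUP = [-2, -6, -9, -8, 0, 21, 53, 87, 114, 125,
         114, 87, 53, 21, 0, -8, -9, -6, -2]

def second_difference(coeffs):
    """Coefficients of the ungraduated values in the second central
    difference of the graduated function."""
    padded = [0, 0] + list(coeffs) + [0, 0]
    return [padded[i - 1] - 2 * padded[i] + padded[i + 1]
            for i in range(1, len(padded) - 1)]

def error_factor(coeffs, divisor):
    """Mean error of a linear compound, in units of the mean error of
    a single ungraduated value (independent errors assumed)."""
    return math.sqrt(sum(c * c for c in coeffs)) / divisor

w_second = error_factor(second_difference(WOOLHOUSE), 125)  # about .18
k_second = error_factor(second_difference(KARUP), 625)      # about .068
w_value = error_factor(WOOLHOUSE, 125)                      # about .42
equivalent_experience = 1 / w_value ** 2                    # about 5.6

print(round(w_second, 3), round(k_second, 3),
      round(w_value, 3), round(equivalent_experience, 2))
```

The last figure rests on the further assumption that the mean error of an ungraduated value varies inversely as the square root of the extent of the experience.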

Taking a generalized formula, such as

$$u'_x = a\,u_x + b(u_{x-1} + u_{x+1}) + c(u_{x-2} + u_{x+2}) + \text{\&c.} \ldots + k(u_{x-t} + u_{x+t})$$

where $u'_x$ represents the graduated value of $u_x$, and
assuming that each of the ungraduated values $u_x$, &c.,
is affected by the same mean error $\pm\epsilon$, it is of course
possible to determine the values of $a$, $b$, $c$, &c., so that the
mean error in, say, $\Delta^2 u'_{x-1}$ shall be a minimum. Noting that
$a = 1 - 2b - 2c - \text{\&c.}$, and that $b + 4c + 9d + \text{\&c.} = 0$, in order that
the formula may be correct to 3rd differences, an expression
may be found for $\Delta^2 u'_{x-1}$ in terms of $u_{x-t-1}$, $u_{x-t}$, &c., with
coefficients involving $c$, $d$, ... $k$. If the coefficients of each
term are now equated to zero, there will be $(2t + 3)$ equations
of condition with $(t - 1)$ unknowns, which may be solved
by the usual method of least squares.

This  is  somewhat  theoretical,  however,  as  the  values  we 
should  obtain  for  the  coefficients  would  be  generally 
fractional, and the resulting graduation formula would not
lend itself to any continuous method of computation, as is
the case with Woolhouse's and other similar formulae. An
alternative would be to fix upon a convenient set of
summations, and then to determine the function summed
(called by Mr. Lidstone the "operand") so that (1) first and
second differences may vanish (see J.I.A., xxxii, 371, &c.);
(2) the range of the formula may be what we require;
and (3) that subject to (1) and (2) the coefficients shall be
such as to make the mean error in $\Delta^2$ or $\Delta^3$ a minimum.
This  might  give  a  fairly  convenient  working  formula,  as 
when  once  the  operand  was  formed  the  ordinary  convenient 
method  of  summation  would  apply. 

If  we  consider  the  effect  of  such  a  formula  of  graduation 
upon  the  outstanding  or  unbalanced  errors  of  observation  in 
a  small  group  of  ages,  we  shall  see  that  they  are  not  very 
materially  diminished.  If,  for  example,  we  express  the  sum 
of  five  consecutive  graduated  values  in  terms  of  the 
ungraduated values, we shall have, in the case of Woolhouse's
formula,

$$l'_{x-2} + l'_{x-1} + l'_x + l'_{x+1} + l'_{x+2} = \frac{1}{125}(80\,l_{x-2} + 101\,l_{x-1} + 115\,l_x + 101\,l_{x+1} + 80\,l_{x+2}) + \text{terms involving other values of } l.$$

Here it is obvious that any systematic or unbalanced error in
the original group will not be greatly reduced (probably
to about three-fourths of its amount) in the graduated table.
While, therefore, finite difference formulae of graduation
yield, generally, a smooth curve as regards the progression of
the graduated values from age to age, they have a tendency
to reproduce any waviness in the original, due to the
unbalanced errors affecting small groups of four or five
consecutive ages.
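The proportion of about three-fourths may be verified as follows. The sketch forms the coefficients of the ungraduated values in the sum of five consecutive graduated values under Woolhouse's formula (numerators over 125); the function name is an illustrative assumption. If each of the five central ungraduated values carries the same unbalanced error, the share of that error surviving in the graduated sum is the total of the five central coefficients divided by 5 × 125.

```python
# Woolhouse's coefficients for t = -7 ... +7 (numerators over 125).
WOOLHOUSE = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]

def window_sum_coefficients(coeffs, width):
    """Coefficients of the ungraduated values in the sum of `width`
    consecutive graduated values (a convolution with a run of ones)."""
    out = [0] * (len(coeffs) + width - 1)
    for i, c in enumerate(coeffs):
        for j in range(width):
            out[i + j] += c
    return out

five = window_sum_coefficients(WOOLHOUSE, 5)
centre = len(five) // 2
central_five = five[centre - 2 : centre + 3]

# Share of an error common to the five central ages that is retained.
share = sum(central_five) / (5 * 125)

print(central_five, round(share, 3))
```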

A question arises in connection with this method as to
what particular function should be selected for graduation.
In the case of Woolhouse's original formula the function
operated upon was $l_x$. Practically speaking, except for the
latter portion of the table, this approximates in result to a
graduation of the rates of mortality. This may be seen from
the following relations. Any adjustment of the $l_x$ column by
a finite difference formula has, of course, the same effect as
a similar graduation of the $d_x$ column. Since $d_x = l_x q_x$, and
since for the range of ages included in the formula (fifteen
in Woolhouse's formula, of which, however, only the five
central ages are heavily weighted) the values of $l_x$ are not
in general widely different, the graduation of the $l_x$ or $d_x$
column should give results not materially different from those
obtained by graduating $q_x$. At the older ages, however,
there may be significant differences in the results, and I must
express my preference for the rate of mortality as the more
suitable function to graduate if the observations are duly
weighted or if proper precautions are taken to avoid
anomalous results at either end of the table where data are
scanty.

An objection to the principle of the finite difference
methods of graduation is that the weight of the observations
is not allowed for at various ages. This objection is not very
serious, however, as at the commencement and end of the
table, where it would be chiefly felt, the method is usually not
strictly applied. It may be noted that if the $l_x$ function be
graduated, then its rapid decrease in value at the oldest ages
in the table gives automatically a diminishing weight to the
observations with increasing age, but at the same time yields
somewhat irregular graduated values. The objection may, of
course, be got rid of by first applying a smooth series of
weights to the function to be graduated, prior to graduation,
and eliminating these factors afterwards.

A  difficulty  arises  in  the  use  of  finite  difference  formulae 
from  the  smallness  of  the  data  at  the  extremes  of  the  table 
and  from  the  fact  that  the  first  7  or  8  values  of  the 
graduated  function  cannot  be  obtained  from  the  formula.  In 
the  case  of  a  mortality  table  there  is  not  so  much  difficulty 
in  dealing  with  extreme  old  age,  because  there,  as 
Woolhouse points out, if we are dealing with the function $l_x$ it
may be taken $= 0$ beyond the limiting age of the table, or if
we are graduating the rate of mortality, $q_x$ may be put down
as equal to unity. As regards the earlier ages, Woolhouse's
method is to obtain from the formula the graduated values
of $l_x$ so far as this can be done, that is, to within 7 years of
the initial age, and to compute the values for the first seven
ages of the table from the values of $l_0$, $l_7$, $l_8$ and $l_9$
($l_0$ representing the value of $l_x$ for the initial age) on the
assumption of a constant third difference. This method may
in certain cases lead to anomalous results, even negative
rates of mortality. Mr. Ackland has given an alternative
method of considerable ingenuity (J.I.A., vol. xxiii, p. 357). The
difficulty may be avoided by assuming values for the initial
ages, as, for example, a constant average value of $q_x$ or $d_x$, or
other arbitrary values deducible from the general character
of the experience. A more satisfactory method would be to
determine $q_x$ for the first 10 or 15 ages, by the method of
moments or least squares, on the assumption that it could
be represented by a first or second difference function. All
these  methods,  however,  are  expedients  more  or  less 
empirical,  though  they  may  in  practice  lead  to  sufficiently 
satisfactory  results. 
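The last-mentioned expedient may be sketched as follows, taking the simplest case of a first difference function, that is, a straight line fitted by least squares to the crude rates at the initial ages. The ages, exposures and crude rates are hypothetical, and the use of the exposures as weights is an assumption made for the purpose of illustration.

```python
def fit_line(xs, ys, ws):
    """Weighted least-squares straight line y = a + b*x."""
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    b = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    a = (sy - b * sx) / sw
    return a, b

# Hypothetical crude rates and exposures for the first ten ages of a table.
ages = list(range(20, 30))
exposed = [120, 260, 410, 560, 700, 820, 930, 1010, 1080, 1130]
crude_q = [0.0083, 0.0038, 0.0073, 0.0054, 0.0071,
           0.0061, 0.0075, 0.0069, 0.0065, 0.0080]

a, b = fit_line(ages, crude_q, exposed)
fitted_q = [a + b * x for x in ages]

for x, q in zip(ages, fitted_q):
    print(f"age {x}: fitted q {q:.5f}")
```

The fitted values would replace the terminal values which the graduation formula cannot supply, and would need to be joined smoothly to the first graduated values obtained from the formula.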

The Finite Difference methods of graduation all
assume that the functions to be graduated may be
represented for successive small tracts of ages by a parabolic curve
of the form

$$u_x = a + bx + cx^2 + \text{\&c.}$$

We are not bound to assume this particular form of
function. We can employ the principle of the Interpolation
method, representing our function by some other form,
as, for example, $m_x = a + bc^x$ corresponding to Makeham's
formula.

The principle of the methods of graduation we have been
discussing, of which Woolhouse's is a type, must not be
confounded with that used by Davies in graduating the
Equitable experience, nor with that used by Mr. Berridge
in graduating the Peerage mortality. These latter are more
nearly allied to graduation by frequency curves than to
Woolhouse's method. In Davies' Equitable graduation,
curves of the third order are actually fitted to successive
sections of the $l_x$ column, the values of $l_x$ from 10 to 40 being
virtually found by a third difference interpolation from the
values $l_{10}$, $l_{20}$, $l_{30}$, $l_{40}$, those from $l_{40}$ to $l_{70}$ similarly from the
values of $l_{40}$, $l_{50}$, $l_{60}$, $l_{70}$, and so on. Mr. Berridge's graduation
of the Peerage mortality followed a similar principle, except
that he represented the entire series of values of $\log l_x$ from
15 to 75 by means of a single curve of the sixth order, based
upon the values of that function for decennial intervals of age.

As to the relative merits of graphic and finite difference
methods of graduation, the former has an undoubted advantage
when the number of facts at our disposal is small. In these
cases  formulae  of  the  type  of  Woolhouse's  cannot  be  expected 
to  produce  very  satisfactory  results,  as  in  the  comparatively 
small  section  of  the  curve  embraced  by  the  formula  the  true 
character  of  the  curve  will  frequently  be  obscured  by  the 
errors  of  observation.  These  formulas  are  at  their  best  when 
applied  to  a  table  based  upon  fairly  extensive  data,  and 
presenting  a  curve  without  any  rapid  change  of  character. 
The  advantages  possessed  by  the  graphic  method  in  dealing 
with  a  small  experience,  owing  to  its  flexibility  and  its  power 
of  bringing  under  contribution  large  sections  of  the  curve  at 
once,  are,  however,  still  more  noticeable  when  frequency 
curves  can  be  suitably  employed. 


We  have  already  spoken  of  the  success  or  sufficiency  of  a 
graduation, but we have not said anything as to what is the
proper  test  of  a  successful  graduation.  Before  dealing  with 
the  general  principle  of  graduation  by  means  of  frequency 
curves,  it  will  be  useful  to  consider  this  question.  There 
are  obviously  two  conditions  that  should  be  fulfilled  by  a 
graduation.  In  the  first  place,  a  smooth  and  continuous 
progression  in  the  graduated  values.  This  is  required  because 
we  have  good  reason  for  believing  that  if  the  true  values  were 
ascertainable,  they  would  exhibit  this  property.  In  the 
second  place  we  require  an  adherence  to  the  original  data, 
sufficiently  close  to  be  fairly  within  what  we  may  conveniently 
term  the  errors  of  observation. 

The  standard  of  smoothness  is  not  easy  to  define.  If  a 
formula  is  adopted  representing  the  ultimate  values  of 
$l_x$, $q_x$, or $\mu_x$ as a function of the age, this in itself secures
a  smooth  series.  In  other  cases  the  sufficiency  or  otherwise 
of  the  graduation  in  this  respect  must  be  left  to  individual 
judgment. The advantages of a really smooth curve are
mainly found where it is necessary to resort to interpolation
or to the use of summation formulae; and, further, in the
practical  consideration  that  with  a  really  smooth  curve  nearly 
all  tables  calculated  therefrom  can  be  sufficiently  checked  by 
differencing. 

As  regards  the  second  requirement,  that  of  adherence  to 
the  general  features  of  the  ungraduated  experience,  it  is 
easier  to  set  up  a  criterion.  We  have  already  seen  that  if 
the  true  value  of  the  probability  of  an  event  happening  at  a 
single  trial  is  p,  the  event  will,  on  the  average,  happen  np 
times in $n$ trials, and if there are series of $n_1$, $n_2$, $n_3$, &c.,
trials in which the probabilities of the respective events are
$p_1$, $p_2$, $p_3$, &c., then on the average the total number of
occurrences in such a series of trials will be $n_1p_1 + n_2p_2 +
n_3p_3 + \text{\&c.}$ That is to say, if the observed occurrences
are $\theta_1$, $\theta_2$, $\theta_3$, &c., then the average value of each term
$(\theta_1 - n_1p_1)$, $(\theta_2 - n_2p_2)$, &c., and consequently of the sum of
such terms, will be zero.* It is also obvious that the average
value of the sum of the series $(\theta_1 - n_1p_1) + 2(\theta_2 - n_2p_2) +
3(\theta_3 - n_3p_3) + \text{\&c.}$, and generally of the series whose $r$th
term is

$$\frac{r!}{t!\,(r-t)!}\,(\theta_r - n_r p_r)$$

will be zero. In the case of a mortality experience these
quantities $(\theta_1 - n_1p_1)$, &c., represent the deviations of the
observed deaths at each age from the "Expected Deaths",
as computed by the true rates of mortality, supposing these
to be known. It follows, therefore, that we should expect
the total of such deviations on the average to be zero, and
in the same way the average value of the successive sums
of the accumulated deviations should be zero. Generally,
if we put

$$\Sigma n = n_0 + n_1 + n_2 + n_3 + \text{\&c.}$$
$$\Sigma\Sigma n = \Sigma^2 n = n_1 + 2n_2 + 3n_3 + \text{\&c.}$$
$$\Sigma\Sigma\Sigma n = \Sigma^3 n = n_2 + 3n_3 + 6n_4 + \text{\&c.};$$

we shall have on the average

$$\Sigma^t(\theta_r - n_r p_r) = 0.$$

* This is not the most probable value of these terms, although in general
it will be very close thereto. The Actuary, however, requires to consider
the average result, not the most probable.

We should not expect (assuming the true values of $p_x$ to
be  known)  that  these  sums  of  the  deviations  of  the  actual 
from  the  expected  numbers  would  actually  be  equal  to  zero 
in  any  given  case,  but  we  should  expect  in  a  long  series  of 
cases  that  the  positive  values  would  approximately  balance 
the  negative.  We  do  not  expect  to  obtain  exactly  1,000 
heads  in  a  series  of  2,000  tossings  of  a  coin,  but  we  should 
expect  to  find  that  the  average  number  of  heads  over  a  great 
number  of  such  series  of  tossings  would  be  very  close  to  that 
figure.  This  reasoning  leads  us  to  the  conclusion  that, 
given  a  successful  graduation,  we  should  not  only  have 
obtained  a  smooth  series,  but  that  the  sum  of  the  deviations 
between  the  computed  events  (deaths  or  otherwise)  and 
the  observed  numbers,  would  be  nearly  zero,  and  that 
the  successive  sums  of  the  accumulated  deviations  would 
be  small. 

It is not necessary in practice that this test should be
pushed too far. We may be satisfied if the sum of the
deviations and the sum of the accumulated deviations are
practically zero; if the total deviations in successive sections
of the table (e.g., in quinquennial or decennial groups) appear
to  be,  on  the  whole,  within  the  limits  of  the  errors  of 
observation ;  and  if  the  total  of  the  accumulated  deviations 
changes  sign  fairly  frequently.  On  the  other  hand  we  should 
expect  that  the  total  deviations  irrespective  of  sign  should 
not  be  materially  less  than  their  theoretical  amount. 
Otherwise  we  should  conclude  that  the  series  was  under- 
adjusted  and  that  accidental  fluctuations  in  the  curve  had 
been  incorporated  as  inherent  characteristics. 
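These practical tests may be set out as a short computation. The function below is a minimal sketch, with an illustrative name and hypothetical figures: it returns the sum of the deviations, the sum of the accumulated deviations, the total deviations irrespective of sign, and the number of changes of sign in the accumulated deviations (zero values being passed over in counting the changes).

```python
from itertools import accumulate

def deviation_tests(actual, expected):
    """Summary figures for testing a graduation against the observed deaths."""
    dev = [a - e for a, e in zip(actual, expected)]
    acc = list(accumulate(dev))                 # accumulated deviations
    signs = [v > 0 for v in acc if v != 0]      # zeros are passed over
    changes = sum(1 for p, q in zip(signs, signs[1:]) if p != q)
    return {
        "sum_dev": sum(dev),
        "sum_acc_dev": sum(acc),
        "abs_dev": sum(abs(d) for d in dev),
        "sign_changes": changes,
    }

# Hypothetical actual and graduated ("expected") deaths at six ages.
actual = [12, 7, 13, 14, 11, 17]
expected = [10, 10, 12, 12, 15, 15]
result = deviation_tests(actual, expected)
print(result)
```

In this illustration the deviations balance exactly, the accumulated deviations nearly so, and the accumulated deviations change sign three times in six ages.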

These tests of a graduation are well known to Actuaries,
and,  indeed,  have  been  very  generally  employed  by  them. 
So  far  as  they  go,  they  correspond  to  the  method  of  moments 
which  Prof.  Karl  Pearson  has  elaborated  and  employed  with 
such  success  in  the  fitting  of  frequency  curves  to  statistical 
data.  It  is  clear,  however,  that  they  can  only  be  employed 
systematically  in  conjunction  with  those  or  other  curves 
capable of analytical expression. Using methods of graduation,
based upon Finite Difference formulas, such as
Woolhouse's,  we  cannot  secure  that  the  successive  sums  of 
the  deviations  shall  vanish,  though  in  general  we  may  expect 
them  to  be  small.  Using  the  graphic  method,  we  can,  by  a 
gradual process of hand-polishing the curve, reduce the
accumulated deviations and their sum to as small a value as
we please,* but the process is a tedious one.

A  second  test  that  has  occasionally  been  applied  when  the 
graduation has been effected by means of a formula, is that of
making the sums of the squares of the deviations a minimum,
the deviations being either in respect of the graduated and
observed deaths at each age or those of the graduated and
ungraduated values of some function such as $l_x$ or $\log l_x$.
This method, known as the method of "Least Squares", is
used very generally in connection with measurements in
astronomy and other physical sciences and has given rise to
a quite extensive literature. It is based upon the assumption
that if in a given series of observations the relative frequency
of an error $x$ at each observation is represented by the
function $k e^{-x^2/c^2}$, then the probability of a conjunction of any

* It may, perhaps, be worth pointing out that if we have obtained a
smooth curve with a general conformity to the original facts, but not making the
$\Sigma$ (deviations) or $\Sigma^2$ (deviations) vanish, this may be done by the following plan.
Assume, for the sake of illustration, that the function graduated is the central
death rate $m_x$. Representing by $m_x$ the graduated values of that function, by
$E_x$ the "Exposed to Risk" in the middle of the year of age, and by $\theta_x$ the
observed deaths, let

$$\Sigma(m_x E_x - \theta_x) = A$$

$$\Sigma^2(m_x E_x - \theta_x) = B$$

then, if $m'_x = a + (1 + b)m_x$ be the modified rates required,

$$a\,\Sigma(E_x) + b\,\Sigma(E_x m_x) = -A$$

$$a\,\Sigma^2(E_x) + b\,\Sigma^2(E_x m_x) = -B$$

whence $a$ and $b$ are determined.

If the table on the whole follows Makeham's law the use of this form of
correction enables us to neglect all orders of differences in the preliminary
adjustment of $m_x$ or $q_x$. Formulae may thus be employed (as for example, a
simple double summation in groups of 10 values, or, still better, successive
summations in 10's, 5's and 2's) giving a much smoother curve than when
account has to be taken of second differences, the resulting systematic error of
this first graduation being corrected as above.

In the alternative, if $m'_x = m_x + a + bx$,

$$a\,\Sigma(E_x) + b\,\Sigma\,x(E_x) = -A$$

$$a\,\Sigma^2(E_x) + b\,\Sigma^2 x(E_x) = -B.$$

This method may be employed in conjunction with Mr. Lidstone's plan of using
a standard table as a base line for purposes of graduation.

set of errors $x_1$, $x_2$, $x_3$, &c., will be proportional to the value
of the product

$$e^{-x_1^2/c^2} \times e^{-x_2^2/c^2} \times e^{-x_3^2/c^2} \times \text{\&c.} = e^{-(x_1^2 + x_2^2 + x_3^2 + \text{\&c.})/c^2}$$

which clearly has a maximum value when the index of $e$ is
numerically a minimum, i.e., when the sum of the squares
of the errors ($x_1^2 + x_2^2 + x_3^2 + \text{\&c.}$) is the least possible. This
expression assumes that the average error, and therefore the
probability of a unit error, in each observation is the same,
an assumption which may often be fairly made in respect to
independent measurements of a physical quantity. If the
observations are not of the same weight, so that the
probabilities of the errors $x_1$, $x_2$, $x_3$, &c., in the respective
measures are

$$k_1 e^{-x_1^2/c_1^2},\quad k_2 e^{-x_2^2/c_2^2},\quad k_3 e^{-x_3^2/c_3^2},\quad \text{\&c.},$$

then the most probable solution will evidently be that which
makes the sum of these exponents the least possible.*

The  assumptions  upon  which  this  method  is  based  are  not 
strictly  in  accord  with  the  conditions  of  a  mortality  experience 
or  similar  statistical  observation.  If  the  method  is  applied  to 
the  deviations  between  the  observed  and  graduated  deaths, 
the  objection  may  be  raised  that  the  observations  at  different 
ages  are  not  of  equal  weight,  and  that  the  probability  of  a 
unit  error  varies  at  each  successive  age,  while  in  each  case 
the  probability  of  a  given  error  can  only  be  approximately 
expressed by the normal function $ke^{-x^2/c^2}$, positive and
negative errors not being equally probable. It is, of course,
possible suitably to weight the observations, so that a
unit error is made equally probable. For example, if at
any given age there are n "exposures", and if the true
probability of death is q, then the "standard deviation" or
$\sqrt{\text{average square deviation}} = \sqrt{nq(1-q)}$, and the probability
of a difference of x between the expected and observed
deaths is approximately $Ke^{-x^2/2nq(1-q)}$; the error in the formula
when x is positive nearly compensating the error when x is
negative. Hence, if the "Exposed to Risk" and "Died" at
each age are multiplied by the factor $[nq(1-q)]^{-1/2}$, where q

*  See  Note  C,  p.  117. 
is to be taken at its true or graduated value,* then the
observations  may  be  considered  to  be  properly  weighted  for 
the  application  of  the  method  of  least  squares. 
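
The weighting just described may be illustrated by a short sketch. The function names and the three-age figures below are hypothetical and are not taken from any table in these lectures; q is supposed to be a graduated (or roughly graduated) rate at each age.

```python
import math


def ls_weight(n, q):
    """Factor [n q (1 - q)]^(-1/2) by which exposed and deaths at one age are multiplied."""
    return 1.0 / math.sqrt(n * q * (1.0 - q))


def weighted_sum_of_squares(exposed, deaths, q_grad):
    """Sum of squared deviations (actual - expected deaths) after weighting each age."""
    total = 0.0
    for n, d, q in zip(exposed, deaths, q_grad):
        w = ls_weight(n, q)
        total += (w * (d - n * q)) ** 2
    return total


# Hypothetical experience at three ages (illustrative figures only).
exposed = [1000, 2000, 500]
deaths = [12, 18, 9]
q_grad = [0.010, 0.010, 0.020]
print(weighted_sum_of_squares(exposed, deaths, q_grad))
```

Each term of the sum is the squared deviation divided by nq(1 - q), so that a unit error carries the same probability at every age.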

We  shall  see  in  the  following  lectures  that  there  is  an 
intimate  relation  between  the  criteria  of  least  squares  and 
moments.  This  will  be  better  discussed  after  considering  the 
question  of  frequency  curves  and  the  process  of  fitting  them 
to  a  set  of  statistical  observations. 

* The ungraduated values of q cannot be used, as this would result in undue
weight being given at all ages where the observed mortality was in excess of the
average, and insufficient weight where it was in defect. Consequently, the
mortality table resulting from this process would on the whole overestimate the
mortality throughout. In other words, the use of the unadjusted values of q
introduces a systematic or "biassed" error into the calculations. If this is
avoided, however, a very rough approximation to the graduated curve of q will
give weights sufficiently near the truth for practical purposes, as a slight change
in the relative weights of a given series of observations produces but little result
upon the final solution.


THIRD   LECTURE. 


I PROPOSE in the present lecture to consider generally the
use  of  frequency  curves  in  relation  to  actuarial  statistics. 
We  have  seen  that  the  graphic  method  of  dealing  with  these 
statistics,  as  also  methods  based  upon  finite  difference  formulae, 
assume  only  that  the  true  law  of  the  series,  if  known,  would 
be  found  to  be  represented  by  a  continuous  curve  amenable  to 
the  ordinary  processes  of  interpolation.  It  is  often  possible, 
however,  to  see  that  the  ungraduated  series  can  be  well 
represented  by  a  curve  of  a  certain  distinct  character,  and 
when  this  is  found  to  be  the  case  more  satisfactory  results  are 
obtained,  particularly  where  the  data  are  few,  by  fitting  to  the 
original  series  a  curve  corresponding  to  its  observed  general 
character,  so  determining  the  constants  in  the  equation  of  the 
curve  as  to  secure  the  closest  agreement  with  the  ungraduated 
curve.  If  for  example  we  turn  to  the  series  in  column  (2)  of 
Table  I,  it  will  be  at  once  seen  that  the  general  character  of  the 
series  accords  very  closely  to  the  "  normal "  frequency  curve, 
or  to  some  curve  having  the  same  general  features.  When 
we  find  that,  by  giving  suitable  values  to  the  constants,  a 
frequency  curve  can  be  made  to  fit  the  observations  within 
the  limits  of  the  errors  of  observation  we  may  be  satisfied  that 
the graduated curve thus produced is probably a better
representation of the original than any that would result from
a graphic or finite difference method of graduation.

Any  curve  which  exhibits  the  law  of  variation  in  a 
particular function, such as a table of $l_x$, $d_x$ or $\mu_x$, may be
considered  for  our  purpose  as  a  frequency  curve.  The 
expression  is  usually,  however,  confined  to  that  class  of  curves 
which  experience  seems  to  show  to  be  specially  applicable  to 
the  observed  distributions  of  deviations  from  mean  values  in 
statistical  tables.     We  have  already  seen  examples  of  such 
tables where the frequency of the deviations of measures from
their mean value follows certain comparatively simple laws.
Professor Karl Pearson has examined a considerable variety
of statistical data (mainly, but not entirely, biological) and
finds  that  in  practically  all  the  cases  examined  the  distribution 
of  the  various  measurements  may  be  represented  fairly  closely 
by  one  or  other  of  the  class  of  curves  derived  from  the 
differential  equation 

l,dy^_Jx-^ H) 

y    dx       a  —  hx — cx^  •     •        \   J 

where  x  represents  the  magnitude  of  a  given  deviation 
from  the  mean  of  a  series  of  measures  and  y  the  frequency 
of  such  deviation. 

As  this  group  of  curves  is  of  considerable  importance, 
though  less  so  perhaps  in  relation  to  actuarial  than  in  relation 
to  some  other  classes  of  statistics,  it  is  convenient  to  consider 
them  first.  It  is  not  necessary  here  to  discuss  these 
curves  analytically ;  the  student  may  be  referred  to  the 
original papers of Professor Karl Pearson*, or to an
admirably condensed resume by Mr. Robert Henderson in
the Journal of the Actuarial Society of America, reprinted
J.I.A., xli, 429-442; and to Mr. W. Palin Elderton's treatise on
"Frequency Curves and Correlation" in which Professor
Pearson's methods are fully described. The table at the end
of these lectures, which gives a sufficiently complete summary
of such of the algebraical properties of these curves as
are most useful in practice, is, with some unimportant
modifications, based upon that given by Mr. Henderson in
his paper. It will be sufficient for our present purpose
to  give  a  brief  general  description  of  these  curves  and  of 
their  use  in  connection  with  actuarial  data. 

We  have  already  seen  that  the  general  character  of  curves, 
such as those of Tables I and II, is approximately determined
by the average value of the squares and cubes of the
deviations of the variable from its mean value; the former
giving a measure of the compactness or diffuseness of the
curve, that is, of the average extent of the deviations from the
mean  irrespective  of  their  direction  ;  the  latter  a  measure  of 
their  departure  from  symmetry,  or  of  the  "  skewness  ",  of  the 
curve.     It  will   be   useful   at  this  point  somewhat  to  extend 

* Phil. Trans., vol. 186, p. 343; vol. 197, p. 413, &c.


this general statement, and, before proceeding to a description
of  particular  curves,  to  explain  more  in  detail  what  is 
meant  by  the  "  moments  "  of  a  curve. 

If  we  suppose  y  =f(x)  to  represent  the  equation  to  a 
given  curve,  x  varying  between  the  limits  h  and  k,  the 
total  area  of  the  curve  will  be  represented  by  the 
expression  : 

$$\text{area} = \int_h^k y\,dx.$$

We  may  suppose,  for  instance,  to  give  definiteness  to  our 
ideas,  that  the  function  y  represents  the  numbers  under 
observation  between  age  x  and  x  +  dx,  the  number  of  "years 
of  life"  observed  between  these  ages  being  ydx,  and  the  area 
of  the  curve,  the  sum  of  all  these  quantities,  being  the  total 
years  of  life  observed  at  all  ages.  If  we  now  multiply 
each  value  of  ydx  by  the  corresponding  age  x  and  divide  the 
total  of  these  products  by  the  total  number  of  the  "  exposed  ", 
we shall have the average age of the whole. Put into
symbols:

$$\int_h^k xy\,dx \div \int_h^k y\,dx = \text{average value of } x \qquad (2)$$

= 1st moment of the curve round the ordinate for which x = 0

= $m_1$, say.

Similarly,

$$\int_h^k x^n y\,dx \div \int_h^k y\,dx = \text{average value of } x^n$$

= nth moment round ordinate for which x = 0

= $m_n$.

The moments of the curve may be taken round any
ordinate we please. If, for example, the average value
of x as found by equation (2), is $x_1$, then the ordinate
corresponding to this value of x passes through the centre
of gravity of the curve, and is termed the "centroid vertical."
In general it is most convenient to take the value of the
moments of the curve round this centroid vertical, for which
obviously the first moment vanishes. The expression for the
nth moment round this ordinate then becomes:

$$\int_h^k (x - x_1)^n\,y\,dx \div \int_h^k y\,dx = \mu_n \qquad (3)$$
the average value of the nth power of the deviations $(x - x_1)$
between the values of x and the mean value. When the
moments  of  a  curve  are  spoken  of  without  qualification, 
it  will  be  understood  that  they  are  the  moments  round 
the  "  centroid  vertical."  These  moments  are,  of  course, 
those  already  referred  to  in  Lecture  I.,  p.  7,  as  representing 
the  sums  of  the  powers  of  the  deviations  of  x  from  its 
mean  value. 
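
For data given at equidistant values of x the integrals above become sums, and the moments may be computed as in the following sketch. The function names are illustrative only; the figures used are those of the symmetrical binomial in column (2) of Table VI.

```python
def raw_moment(xs, ys, n):
    """n-th moment round the ordinate x = 0: sum(x^n * y) / sum(y)."""
    return sum(x ** n * y for x, y in zip(xs, ys)) / sum(ys)


def central_moment(xs, ys, n):
    """n-th moment round the centroid vertical, i.e. round the mean value of x."""
    mean = raw_moment(xs, ys, 1)
    return sum((x - mean) ** n * y for x, y in zip(xs, ys)) / sum(ys)


# Symmetrical binomial of Table VI, column (2).
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [50, 300, 750, 1000, 750, 300, 50]

for n in (1, 2, 3, 4):
    print(n, central_moment(xs, ys, n))
```

The odd moments vanish, as they must for a symmetrical series, while the second and fourth measure its spread.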

The following formulae, which may be readily demonstrated,*
connect the values of the moments round the "centroid
vertical" with the moments round the ordinate for which
x = 0. Using the same notation as above, we have

$$\mu_2 = m_2 - (m_1)^2$$

$$\mu_3 = m_3 - 3m_1m_2 + 2(m_1)^3 \qquad (4)$$

$$\mu_4 = m_4 - 4m_1m_3 + 6(m_1)^2m_2 - 3(m_1)^4$$


where  the  law  of  the  coefficients  is  sufficiently  obvious. 
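
The conversion from moments round x = 0 to moments round the centroid vertical may be sketched as follows. The function name and the small three-point series used to try it are hypothetical.

```python
def central_from_raw(m1, m2, m3, m4):
    """Moments round the centroid vertical from the moments round x = 0."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu2, mu3, mu4


# Hypothetical series: x = 1, 2, 4 with frequencies 1, 2, 1.
xs = [1, 2, 4]
ys = [1, 2, 1]
total = sum(ys)
m = [sum(x ** n * y for x, y in zip(xs, ys)) / total for n in (1, 2, 3, 4)]
print(central_from_raw(*m))
```

The same values are obtained by summing the powers of the deviations from the mean directly, which affords a check upon the arithmetic.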

For  the  particular  family  of  curves  arising  from  the 
differential equation (1) formulae may readily be found
for  the  moments  involving  the  various  constants  of  the 
curves,  and  inversely,  the  values  of  the  constants  can  be 
expressed  in  terms  of  the  moments.  The  formulae  for 
the  higher  moments  being  sometimes  complicated,  it 
is  more  convenient  to  tabulate  certain  functions  of  the 
moments, e.g.:

$$\beta_1 = \frac{(\mu_3)^2}{(\mu_2)^3},\qquad \beta_2 = \frac{\mu_4}{(\mu_2)^2},\qquad \gamma = \frac{\beta_1 + 4}{\beta_2 + 3},$$

from which the constants of the curves may be obtained more
readily, and which are also useful in discriminating between the
curves applicable to a given set of observations.
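
A minimal sketch of these functions of the moments follows. The function name is illustrative, and the third function is taken here to be (beta1 + 4)/(beta2 + 3), which is an assumption as to the form intended.

```python
def moment_functions(mu2, mu3, mu4):
    """Return (beta1, beta2, gamma) from the moments round the centroid vertical."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    gamma = (beta1 + 4) / (beta2 + 3)  # assumed form of the third function
    return beta1, beta2, gamma


# A normal curve has mu3 = 0 and mu4 = 3 * mu2^2, whatever its spread.
print(moment_functions(2.0, 0.0, 12.0))
```

For the normal curve beta1 vanishes and beta2 is 3, so that on the assumed definition the third function is 2/3.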


* See Elderton, pp. 17-19; Henderson, J.I.A., xli, 431-2.
The various curves arising from the differential equation
(1) may, for our present purpose, be conveniently classified
as under:

Class I.    Symmetrical curves. Range limited.
Class II.   Symmetrical curves. Range unlimited.
Class III.  Skew curves. Range limited in both directions.
Class IV.   Skew curves. Range limited in one direction.
Class V.    Skew curves. Range unlimited in either direction.

The various types of curve are as follows. It will be seen
that some of these Classes are represented only by a single
type of curve:

Class I. Symmetrical curves of limited range. In this
class we have only the single curve.

Type 1.
$$y = \kappa\left(1 - \frac{x^2}{a^2}\right)^m$$

The values of x range from +a to -a, for either of
which values of the variable y becomes zero.

The average value of x is obviously zero, the corresponding
ordinate y is a maximum, and clearly bisects the area enclosed
between the curve and the axis of x. In other words, the
"mean", "mode", and "median" of the curve all coincide,
as in all symmetrical curves.

The second moment of the curve

$$\mu_2 = \frac{a^2}{2m+3}$$

and the "standard deviation"

$$= \frac{a}{\sqrt{2m+3}}.$$

The fourth moment

$$\mu_4 = \frac{3a^2}{2m+5}\,\mu_2.$$

The value of m will usually be positive when y equals zero at
both limits. If m is > 0 and < 1 the curve cuts the base-line at an
angle. If m is negative the value of y becomes infinite at
both limits, and m is always > -1.

This  curve  has  a  close  relationship  with  the  symmetrical 
point  binomial  curve,  whose  terms   are  proportional   to   the 
terms in the expansion of $(\tfrac{1}{2} + \tfrac{1}{2})^n$, the general term of which
may be written

$$y = \frac{\kappa}{\left(\frac{n}{2} + x\right)!\,\left(\frac{n}{2} - x\right)!}.$$

[It will, of course, be understood that the $\kappa$'s in these
formulae, and in others, are not identical, but
simply stand for some constant in each case, the
numerical value of which is determined by the
area of the curve.]

The binomial curve, however, can be conveniently used
only to represent the definite points corresponding to integral
values of $\frac{n}{2} + x$, whereas Type 1 represents a continuous
curve (Note D, p. 122). The data with which an actuary has
to  deal  are  generally  in  the  latter  form,  for  example,  the 
numbers  living,  the  number  of  deaths,  withdrawals,  &c., 
between  the  ages  x  and  x+l,  and  although  usually  the 
number  of  terms  in  the  series  is  so  considerable  that  the 
curve  may  be  treated  as  a  series  of  points,  on  the  other  hand, 
a  binomial  having  so  many  terms  will  not  generally  be  found 
a  suitable  curve  to  employ.  In  most  instances  where  a  series 
can  be  fairly  represented  by  the  symmetrical  binomial,  it  can 
also  be  fairly  represented  by  Type  1,  with  possibly  some 
slight  difference  in  range,  as  will  be  seen  later. 

There  are  other  symmetrical  curves  of  limited  range, 
which  are  in  the  nature  of  frequency  curves,  but  which  do 
not  belong  to  the  family  of  curves  derived  from  equation 
(1)  :  such,  e.g.,  as  the  curve 


which, however, we need not discuss here.

Class II. Symmetrical curves of unlimited range. In this
class are two curves belonging to the family with which we
are dealing.

Type 2.
$$y = \kappa e^{-x^2/c^2} \qquad (5)$$

This is the curve of "facility of error", or the "normal"
frequency curve.

The average value of x is clearly zero, corresponding to
the "mode" or the maximum value of y, and to the median.

The second moment $= \mu_2 = \dfrac{c^2}{2}$, and the standard
deviation $= \dfrac{c}{\sqrt{2}}$.
Type 1 evidently transposes into this curve when the
value of a, and hence the range of the curve, is made
indefinitely great. If we put $\frac{a^2}{m} = c^2$, making both $a^2$ and m
indefinitely great, but their ratio finite, we have

$$\operatorname{Limit}\left(1 - \frac{x^2}{a^2}\right)^m = e^{-x^2/c^2} \qquad (6)$$
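
The passage to the limit may be watched numerically. In the sketch below the value $c^2 = 3$ and the point x = 1 are hypothetical choices; $a^2$ is made to increase with m so that their ratio remains equal to $c^2$.

```python
import math


def type1(x, a2, m):
    """Ordinate of Type 1 with kappa = 1: (1 - x^2/a^2)^m."""
    return (1.0 - x * x / a2) ** m


def type2(x, c2):
    """Ordinate of Type 2 with kappa = 1: exp(-x^2/c^2)."""
    return math.exp(-x * x / c2)


c2 = 3.0  # hypothetical value of c^2
x = 1.0
for m in (5, 50, 500, 5000):
    a2 = m * c2  # keeps a^2 / m = c^2 while both grow
    print(m, type1(x, a2, m), type2(x, c2))
```

The Type 1 ordinate approaches that of the normal curve from below as m increases.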

Even  when  the  range  of  the  curve  is  not  great,  that  is 
when m and $a^2$ are not large numbers, there is a fairly close
agreement  between  curves  of  Types  1  and  2  and  the  symmetrical 
binomial. 

This may be seen by a numerical example, the following
table showing

1. The values of $y = \dfrac{36000}{(3+x)!\,(3-x)!}$ for integral values of
x, these values being proportionate to the terms in
the expansion of the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^6$.

2. The values of $y = 993\left(1 - \dfrac{x^2}{a^2}\right)^m$.

3. The values of $y = 1{,}026\,e^{-x^2/c^2}$,

the constants in the two latter curves being chosen to give
as good general agreement as practicable with the binomial
curve.

Table VI.

Showing Similarity of Types 1 and 2 to the Symmetrical Point
Binomial.

  Values of     Binomial curve               Type 1                    Type 2
  Variable x    y = 36000/[(3+x)!(3-x)!]     y = 993(1 - x^2/a^2)^m    y = 1026 e^(-x^2/c^2)
     (1)               (2)                        (3)                       (4)

     -4                  0                          2                         6
     -3                 50                         47                        56
     -2                300                        303                       282
     -1                750                        752                       743
      0              1,000                        993                     1,026
      1                750                        752                       743
      2                300                        303                       282
      3                 50                         47                        56
      4                  0                          2                         6

  Totals             3,200                      3,201                     3,200
Had the range of the curves been greater, the binomial
being taken to a higher power, and the values of the
constants $a^2$ and m in col. (3) and of $c^2$ in col. (4) been
larger, the agreement of the three curves would have been
correspondingly closer. As it is, the two first curves are
very nearly identical, while the "normal" curve, although
theoretically of unlimited range, is fairly close to the
binomial, the terms corresponding to values of x numerically
greater than 4, amounting to less than 1 in the aggregate.
It will be noticed that the values of y in the limited curves
necessarily diminish more rapidly as the limiting values of x
are approached, while the normal curve is less flat in the
centre.
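
Columns (2) and (4) of Table VI may be recomputed as a check. The sketch below takes $c^2 = 3.1$ for the normal curve; this is an assumed value, adopted here only because it reproduces the tabulated figures. The function names are illustrative.

```python
import math


def binomial_column(x):
    """36000 / ((3+x)! (3-x)!), proportional to the terms of (1/2 + 1/2)^6."""
    if abs(x) > 3:
        return 0.0
    return 36000 / (math.factorial(3 + x) * math.factorial(3 - x))


def normal_column(x, k=1026.0, c2=3.1):
    """k * exp(-x^2 / c^2); c2 = 3.1 is an assumption, not a quoted constant."""
    return k * math.exp(-x * x / c2)


for x in range(-4, 5):
    print(x, round(binomial_column(x)), round(normal_column(x)))
```

On this assumption the terms of the normal curve lying beyond x = 4 in either direction amount together to less than 1, in agreement with the remark made above.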

Type 3.
$$y = \kappa\left(1 + \frac{x^2}{a^2}\right)^{-m} \qquad (7)$$

This curve, which is also symmetrical and unlimited in
range, diverges from the normal curve in a direction opposite
to Type 1, the values of y diminishing, when x is large, more
slowly than in the normal curve. The curve transposes into
the latter (Type 2) when $a^2$ and m are indefinitely large, $\frac{a^2}{m} = c^2$
being, however, finite. We then have

$$\operatorname{Lt.}\ \kappa\left(1 + \frac{x^2}{a^2}\right)^{-m} = \kappa e^{-x^2/c^2}.$$

The average value of x in the curve $y = \kappa(a^2 + x^2)^{-m}$ is
zero, corresponding again to the "mode"; the second
moment $= \mu_2 = \dfrac{a^2}{2m-3}$ and the "standard deviation"
$= \dfrac{a}{\sqrt{2m-3}}$. The fourth moment $= \mu_4 = \dfrac{3a^2}{2m-5}\,\mu_2$ and, it is
clear, becomes infinite unless $m > \tfrac{5}{2}$. Indeed, the higher
moments of the curve must become infinite whatever be the
value of m.

The  classes  of  symmetrical  curves  are  of  somewhat  limited 
application  to  actuarial  statistics,  although  there  are  certain 
cases  in  which  they  represent  the  observations  fairly  well. 

Class  III.  Skew  curves.  Range  limited  in  both  directions. — 
There  is  only  a  single  curve  of  this  class  in  the  family  of 
curves  we  are  considering,  namely  : 

Type 4.
$$y = \kappa\left(1 - \frac{x}{a}\right)^{m_1}\left(1 + \frac{x}{a}\right)^{m_2} \qquad (8)$$

The values of x range from -a to +a; the "mode" is
at $x = \dfrac{m_2 - m_1}{m_1 + m_2}\,a$, for which value y is a maximum; the
mean value of x is $\dfrac{m_2 - m_1}{m_1 + m_2 + 2}\,a$. The expressions for the
moments of the curve are simplified by putting it into the
form given in the table on p. 140. If we write $m_1 = np - 1$, and
$m_2 = nq - 1$ (where p + q = 1), the equation to the curve
(which does not, of course, change in character with this
transposition) becomes

$$y = \kappa\left(1 - \frac{x}{a}\right)^{np-1}\left(1 + \frac{x}{a}\right)^{nq-1} \qquad (9)$$

the variable having the same range of values -a to +a, the
"mode" being at $x = \dfrac{n}{n-2}(q - p)\,a$; the average value of
$x = (q - p)\,a$; the second moment $= \mu_2 = \dfrac{4pq\,a^2}{n+1}$, and the
"standard deviation" the square root of this quantity.

When $m_1 = m_2 = m$, this Type evidently transposes into
Type 1, and thence into Type 2 when m is infinite.
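
The mean and second moment of the curve in form (9) may be verified by direct summation of its ordinates. The sketch below takes the mean as (q - p)a and the second moment as 4pqa²/(n + 1), and uses hypothetical constants n = 18, p = 1/3, a = 3; the function name and the mid-ordinate rule are likewise choices made for illustration.

```python
def type4_mean_and_mu2(n, p, a, steps=20000):
    """Mean and second central moment of (1 - x/a)^(np-1) (1 + x/a)^(nq-1) on (-a, a)."""
    q = 1.0 - p
    h = 2.0 * a / steps
    s0 = s1 = s2 = 0.0
    for i in range(steps):
        x = -a + (i + 0.5) * h  # mid-ordinate of each small interval
        y = (1.0 - x / a) ** (n * p - 1.0) * (1.0 + x / a) ** (n * q - 1.0)
        s0 += y
        s1 += x * y
        s2 += x * x * y
    mean = s1 / s0
    return mean, s2 / s0 - mean ** 2


n, p, a = 18, 1.0 / 3.0, 3.0  # hypothetical constants
q = 1.0 - p
print(type4_mean_and_mu2(n, p, a))
print((q - p) * a, 4.0 * p * q * a * a / (n + 1.0))
```

The two lines printed should agree closely, the first being found by summation and the second from the formulae.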

This curve is related to the skew point binomial arising
from the expansion of $(p + q)^n$, where p and q have
approximately the same values as in equation (9), and
where the index of the binomial is not too small, there is
a fair numerical agreement, as may be seen in the following
table, where the figures given in col. (2) are proportional to
the terms in the binomial expansion of $(\tfrac{1}{3} + \tfrac{2}{3})^6$:

Table VII.

Showing Numerical Similarity of the Curve of Type 4 with the
Skew Binomial.

  Value of      Binomial curve                     Type 4
  Variable x    y = 5760 * 2^x/[(3+x)!(3-x)!]      y = K(4.75 - x)^m1 (6.25 + x)^m2
     (1)               (2)                              (3)

     -4                  0                                0
     -3                  1                                1
     -2                 12                               13
     -1                 60                               61
      0                160                              159
      1                240                              240
      2                192                              194
      3                 64                               60
      4                  0                                1

  Totals               729                              729

It  will  be  seen  that  for  so  small  a  value  of  n  as  6  the 
binomial  curve  can  be  closely  represented  by  means  of 
selected  points  in  the  continuous  curve  of  Type  4.  When  the 
value  of  n  is  large,  a  much  closer  agreement  is  obtainable. 
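
The binomial column of Table VII may be recomputed as follows. The function name is illustrative; the second list merely forms the same terms from the binomial coefficients as a check.

```python
import math


def skew_binomial_column(x):
    """5760 * 2^x / ((3+x)! (3-x)!): 729 times the terms of (1/3 + 2/3)^6."""
    if abs(x) > 3:
        return 0.0
    return 5760 * 2.0 ** x / (math.factorial(3 + x) * math.factorial(3 - x))


column = [round(skew_binomial_column(x)) for x in range(-4, 5)]
print(column, sum(column))

# The same terms written as C(6, k) * 2^k, with k = x + 3.
check = [math.comb(6, k) * 2 ** k for k in range(7)]
print(check, sum(check))
```

Both forms give the same seven terms, whose total is $3^6 = 729$.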

The  skew  binomial  is  of  importance  to  the  actuary  as 
representing  the  law  of  the  deviations  between  the  actual 
number of events observed in a given series of trials and the
"expected" number when computed by the true value
of the probabilities. There are very many statistical
distributions capable of being well represented by the
binomial curve if the latter is treated as a continuous curve.
This procedure is not, however, convenient in practice, as it
rarely happens that the given ordinates coincide with the
integral values of x in the general term $\dfrac{\kappa}{x!\,(n-x)!}\left(\dfrac{p}{q}\right)^x$, and,
moreover, the analysis, when the curve is treated as
continuous, is not very simple. (See Note D, p. 122.)

The  form  of  curve  corresponding  to  Type  4  varies  very 
considerably  with  certain  changes  in  the  values  of  the 
constants $m_1$ and $m_2$. In its more usual form, when both
$m_1$ and $m_2$ are > 1, as in Table VII, the curve bears a
general  resemblance  to  the  age  distribution  of  the 
"  entrants "  in  a  mortality,  or  similar  experience  {see 
Table  II),  also  to  the  numbers  of  the  exposed  to  risk;  to 
the  number  of  marriages,  or  to  the  rate  of  marriage  at 
various  ages ;  to  the  average  number  of  children  under  age, 
or  to  the  cost  of  their  pensions  at  the  death  of  the  father,  a 
function  of  use  in  pension  fund  valuations;  to  the  number 
of  retirements  in  such  funds  where  superannuation  occurs 
on  invalidity  and  not  at  a  specified  age ;  to  the  incidence 
of  attacks,  or  of  mortality,  from  certain  diseases,  &c.  Owing 
to  the  number  of  constants  involved  (as  the  increment  of  x 
may  represent  any  period  of  time,  there  are  virtually  five), 
the  curve  is  very  adaptable. 

It will be readily seen that if the values of both $m_1$ and
$m_2$ in equation (8) are high the curve makes very close
contact with the axis of x at either limit; if $m_1$ or $m_2$ lies
between 0 and 1, the curve meets the axis of x at an angle;
whereas, if either or both of them are negative, the expression
becomes infinite at one or both limits. The area of the curve
and the moments do not, however, become infinite if both $m_1$
and $m_2$ are greater than -1.
Class IV. Skew curves. Range limited in one direction.
There are two curves of this class.

Type 5.
$$y = \kappa x^m e^{-x/a} \qquad (10)$$

which is a limiting form of curve No. 4, the values of x
ranging from 0 to $\infty$.

The "mode" is at x = ma; the mean value of x is
(m + 1)a; the second moment $(m+1)a^2$; and the third
moment $2(m+1)a^3$; these being sufficient to determine the
constants.
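
The determination of the constants of Type 5 from its moments may be sketched thus. The function name is hypothetical, and so is the treatment of the origin: the mean is used here to find the point from which x must be measured so that the curve starts at zero.

```python
def fit_type5(mean, mu2, mu3):
    """Constants of y = k * x^m * exp(-x/a) from the mean, second and third moments.

    From mu2 = (m + 1) a^2 and mu3 = 2 (m + 1) a^3 it follows that
    a = mu3 / (2 mu2) and m + 1 = mu2 / a^2.
    """
    a = mu3 / (2.0 * mu2)
    m = mu2 / a ** 2 - 1.0
    origin = mean - (m + 1.0) * a  # assumed: start of the curve on the scale of the data
    return m, a, origin


# Moments of a curve with m = 3 and a = 2, measured from its own origin.
print(fit_type5(mean=8.0, mu2=16.0, mu3=64.0))
```

With the figures shown the sketch recovers m = 3 and a = 2, the origin falling at zero.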

In the usual form of the curve, that is when m > 1, this
curve  represents  fairly  well  some  of  the  statistical  distributions 
represented  by  curve  No.  4.  Owing  to  the  feature  that  as  x 
becomes  large  the  successive  terms  have  a  tendency  to  run 
into  a  geometrical  progression,  it  is  not  so  well  suited  to  such 
distributions  as  that  of  the  "  exposed  to  risk  ",  where  the  effect 
of  the  rapid  rise  in  the  rate  of  mortality  at  the  older  ages 
makes  itself  felt  in  an  increasingly  rapid  diminution  in  the 
values  of  y.  This  is  somewhat  unfortunate,  as  the  curve  is  a 
simple  one,  determined  by  the  values  of  its  first  three 
moments,  and  except  for  the  reason  stated,  well  suited  for  use 
in  connection  with  Makeham's  formula  for  the  force  of 
mortality. 

As  in  Type  4,  the  character  of  this  curve  may  be  entirely 
changed by an alteration in the value of the constant m. If
this constant vanishes the curve becomes a diminishing
geometrical progression; while for negative values of m the
curve becomes infinite at the lower limiting value of x. The
value of m must in any case be > -1.

The  actuary  has  to  deal  with  several  distributions  roughly 
similar  to  a  diminishing  geometrical  progression  as,  for 
example,  the  curve  of  infant  mortality,  the  rate  of  withdrawal 
in  successive  policy  years,  or  the  difference  between  the  select 
and  ultimate  mortality  rates  in  a  select  mortality  table.  Other 
expressions  giving  a  similar  form  of  curve  may  be  employed  to 
represent  these  distributions  as,  for  example,  y  —  K  (a +  6-"'^), 
with  a  minimum  value  of  /ca  when  x  is  very  large;  or 
y  =  K{x  +  a)~^,  where  if  a  is  small  we  have  a  curve  again 
similar  to  that  of  infant  mortality,  x  representing  the  age. 

Type 6.
$$y = \kappa\left(\frac{x}{a} - 1\right)^{m_1}\left(\frac{x}{a} + 1\right)^{-m_2} \qquad (11)$$

where the limiting values of x are a and $\infty$, with an
average value of $x = \dfrac{m_1 + m_2}{m_2 - m_1 - 2}\,a$, the "mode" occurring
at $x = \dfrac{m_1 + m_2}{m_2 - m_1}\,a$. The expressions for the moments are much
simplified by writing the equation to the curve in the form
given in the Table on pp. 140-1.

Type 7.
$$y = \kappa x^{-m} e^{-a/x} \qquad (12)$$

where x varies between 0 and $\infty$, having an average value of
$\dfrac{a}{m-2}$, with the "mode" at $x = \dfrac{a}{m}$. The second moment
$\mu_2 = \dfrac{a^2}{(m-2)^2(m-3)}$ and the "standard deviation"
consequently $= \dfrac{a}{(m-2)\sqrt{m-3}}$.

Here m must be > 3, or the second moment becomes $\infty$,
and the fourth moment becomes infinite unless m is greater
than 5.

Neither this nor the preceding curve is of any wide
application  in  actuarial  statistics,  owing  to  the  fact  that  the 
values  of  y  for  large  values  of  x  diminish  with  increasing 
slowness ;  a  feature  not  often  met  with  in  practice  except  in 
such  a  function  as  the  "  rate  of  withdrawal."  The  same 
remark  holds  good  of  the  single  curve  constituting  Class  V. 

Class V. Skew curves. Range unlimited in either direction.


2\  -m 


w 


This    is    the    only    skew    curve    of    this    family    having 
unlimited  range.     The  average  value  of  x— cy, y.  a;  the 


"  mode  "  is  at  ,t — a. 
2m 


The expressions for the moments and their functions are

simplified  Ijy  writing  (+ Ij  for  m   in   equation   (13),   as  in 

the  Table  on  pp.  140-1.    For  the  reason  stated  above,  the  curve 
is  not  specially  useful  to  Actuaries. 

Assuming that a given statistical series can be represented
by  one  or  other  of  the  curves  above  described,  the  appropriate 
curve  can  be  found  by  means  of  certain  criteria  based  upon 
an  examination  of  the  "  moments  "  of  the  curve ;  that  is  to 
say,  the  sums  of  the  powers  of  the  deviations  from  the  mean 
value.  These  criteria  are  furnished  by  the  table  on  pp.  140-1, 
above  referred  to. 

As the calculation of the criterion is somewhat lengthy,
it may be noted that if the logarithms of y are tabulated
for equal intervals of the variable x, and the values of
$\Delta^2 \log y$ taken out, these give us information as to the
nature of the curve. The value of $\Delta^2 \log y$ will be
constant and negative for the "normal" curve Type 2;
negative and symmetrical with a minimum numerical value
in the centre of the range, for Type 1, or for any binomial
curve; uniformly negative, non-symmetrical, and with a
numerical minimum in the case of Type 4 (where this
curve vanishes at the limits); and uniformly negative and
continuously decreasing towards the upper limit of x in the
case of Type 5, where this curve vanishes at the limits.
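
The test by second differences of the logarithms may be tried upon two simple series. In the sketch below the normal curve is given the hypothetical constants $\kappa = 1000$ and $c^2 = 3$, and the binomial figures are those of $(\tfrac{1}{2} + \tfrac{1}{2})^6$ as tabulated in Table VI; the function name is illustrative.

```python
import math


def second_differences_of_log(ys):
    """Second differences of log y for ordinates at equal intervals of x."""
    logs = [math.log(y) for y in ys]
    return [logs[i + 2] - 2.0 * logs[i + 1] + logs[i] for i in range(len(logs) - 2)]


# Normal curve y = 1000 * exp(-x^2 / 3) at x = -3, ..., 3 (hypothetical constants).
normal = [1000.0 * math.exp(-x * x / 3.0) for x in range(-3, 4)]
print(second_differences_of_log(normal))

# Symmetrical binomial: terms proportional to (1/2 + 1/2)^6.
binomial = [50, 300, 750, 1000, 750, 300, 50]
print(second_differences_of_log(binomial))
```

The first series gives the constant value $-2/c^2$; the second is negative and symmetrical, with its least numerical value in the centre of the range.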

In  the  case,  therefore,  of  those  curves  most  useful  to  the 
Actuary the function $\Delta^2 \log y$, computed for the ungraduated
curve, enables us to select generally the formula most suited
to the series. For this purpose if the data are grouped it
will generally be better to compute the approximate values
of the central ordinates of each group by an interpolation
formula,  such  as  that  given  on  p.  57. 

Other  types  of  curves  will  sometimes  be  found  useful 
besides  those  arising  from  the  differential  equation  on  p.  39 ; 
but  they  do  not  generally  lend  themselves  so  readily  to  the 
method  of  moments. 

If, for example, we write

$$y = \kappa\,e^{-\left(\frac{m}{a+x} + \frac{n}{b+x}\right)} \qquad (14)$$

we obtain, when m and n are numerically unequal, a skew
curve vanishing when x = -a or -b. We may deal with this
curve in practice by determining the values of equidistant
ordinates as shown on pp. 57-8. Thus

$$\log y = \kappa' - \frac{m}{a+x} - \frac{n}{b+x} = w; \qquad (15)$$
As log y becomes $-\infty$ at the limits, we multiply both sides
by (a + x)(b + x), thence

$$w\,[ab + (a+b)x + x^2] = \kappa'\,[ab + (a+b)x + x^2] - m(b+x) - n(a+x) = A + Bx + Cx^2 \text{ (say)} \qquad (16)$$

where the unknowns are a, b, A, B and C.

If we difference three times, the right hand side vanishes,
and we have a series of expressions involving (ab) and (a + b)
equated to zero; and by suitably grouping these, or by using
the method of moments, a and b, and thence the remaining
constants, may be evaluated.
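
That the third differences vanish may be seen from a numerical trial. All the constants in the sketch below are hypothetical; they are chosen so that the curve lies between x = -10 and x = 40, and the function names are illustrative.

```python
def w_value(x, k, m, n, a, b):
    """w = k - m/(a + x) - n/(b + x), as in equation (15)."""
    return k - m / (a + x) - n / (b + x)


def third_differences(values):
    """Difference a series of equidistant values three times."""
    d = list(values)
    for _ in range(3):
        d = [d[i + 1] - d[i] for i in range(len(d) - 1)]
    return d


# Hypothetical constants; the range of the curve is -a < x < -b.
k, m, n, a, b = 2.0, 3.0, -1.5, 10.0, -40.0
xs = range(0, 8)
products = [w_value(x, k, m, n, a, b) * (a + x) * (b + x) for x in xs]
print(third_differences(products))
```

The products w(a + x)(b + x) form a series of the second degree in x, so that their third differences are zero apart from the small errors of the arithmetic.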

A  similar  process  may  be  employed  with  advantage  with  a 
curve  such  as  the  usual  form  of  exposed  to  risk  or  died,  when 
the  data  are  in  large  age  groups.  We  inay  then  take  w  in 
equation  (15)  to  represent  the  common  log  of  the  ratio  of  the 
numbers  above  age  x  to  the  numbers  below  age  x  in  the  series. 
That  is,  if  the  total  number  in  the  series  =N,  the  number 
above  age  a;  =Y,  we  may  Avrite 


\N-YJ  a  +  x      b  +  x 


o 


In  many  cases  the  constant  K'  may  be  omitted  if  the 
number  of  groups  is  small;  in  this  case  C  in  equation  (16) 
becomes  zero.  On  the  other  hand  it  may  sometimes  be  found 
necessary  to  add  a  term  to  the  right  hand  of  equation  (16) 
involving $x^3$.




FOURTH    LECTURE. 


We shall now consider very shortly the problem of fitting
frequency curves to statistical data. To do this at length
would be impossible in the time at our disposal, and the
student who wishes to pursue the subject in detail may read the
original papers, already referred to (p. 39), of Professor Karl
Pearson, to whom the development of the subject is due,
or Mr. Elderton's book. There are certain general principles,
however, which may be usefully considered. The method
usually employed in fitting these curves is by making the
moments of the graduated equal to those of the ungraduated
curve, which is equivalent to making the quantities
$\Sigma$ (deviations), $\Sigma^2$ (deviations), &c., as far as $\Sigma^4$ or $\Sigma^5$, equal
to zero. This method may not always be the most convenient
or the best for the purpose of the Actuary, but it is so
for most statistical purposes, and has come much into use
accordingly.

We  have  already  seen  that,  in  the  case  of  the  curves 
arising  from  the  differential  equation  on  p.  39,  expressions 
for  the  moments  may  be  obtained  in  terms  of  the  constants 
which  will  enable  us  to  determine  the  value  of  the  constants, 
when  the  numerical  value  of  the  moments  is  known.  For  the 
purpose  of  fitting  the  appropriate  curve  to  any  given  series 
of  observations  it  is  only  necessary  to  determine  the  value 
of  the  moments  as  given  by  the  observations,  that  is,  the 
value  of  the  sum  of  the  squares,  cubes,  &c.,  of  the  deviations 
from  the  mean  value  of  the  variable. 

It  will  be  useful  to  consider  shortly  the  calculation  of  the 
numerical  value  of  the  moments  in  a  given  instance.  Take 
first  the  simplest  possible  case  where  we  have  to  do  not  with 
a continuous curve, but with a series of points representing
isolated ordinates, where in consequence we replace integrations
by summations. In the following table, the first column
contains the values of the independent variable $x$, the range
of values being from 0 to 6. The second column contains
the values of its function $y$, which are proportionate to the
successive terms in the expansion of the binomial $\left(\tfrac{1}{3}+\tfrac{2}{3}\right)^6$,
the constant multiplier 729 being introduced merely to avoid
fractions. The remaining columns, in which the average
value of $x$ and the values of the successive moments are
worked  out,  explain  themselves.  It  may  be  remarked  that  in 
this  example  the  average  value  of  x,  and  the  deviations  from 
the  average,  are  all  integral,  and  it  is  therefore  convenient 
to  calculate  at  once  the  moments  round  the  average  value 
("  centroid  vertical").  In  most  cases,  however,  the  average 
and  the  deviations  will  not  be  integral,  and  then  it  will 
be  more  convenient  to  calculate  the  moments  round  the 
origin  or  some  selected  middle  value  of  the  variable, 
afterwards  transferring  the  moments  to  the  mean  by  the 
formulae  given  on  p.  41. 

Table VIII.
Moments of the Point Binomial Curve.

$$y = 729\binom{6}{x}\left(\tfrac{1}{3}\right)^{6-x}\left(\tfrac{2}{3}\right)^{x} = \binom{6}{x}\,2^{x}$$

   x        y       xy    (x-4)y   (x-4)²y   (x-4)³y   (x-4)⁴y
   0        1        0       -4        16       -64       256
   1       12       12      -36       108      -324       972
   2       60      120     -120       240      -480       960
   3      160      480     -160       160      -160       160
   4      240      960        0         0         0         0
   5      192      960     +192       192      +192       192
   6       64      384     +128       256      +512     1,024

Totals    729    2,916        0       972      -324     3,564
÷ 729       1        4        0       4/3      -4/9     4 8/9
               (mean value)  = μ₁     = μ₂      = μ₃      = μ₄

Obviously, when the moments are calculated about the mean
the first moment is zero (because it represents the average
deviation from the average value). The even moments are
always positive, because each term is of the form $yx^{2n}$,
i.e., essentially positive; and if the curve is symmetrical the
odd moments vanish, because each term of the form $y_x x^{2n+1}$ is
cancelled by a term (equidistant from the mean) of the
form $y_{-x}(-x)^{2n+1}$. In general, where the curve is not
symmetrical, the third, fifth, &c., moments will not be zero.
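
The arithmetic of Table VIII can be checked with a short sketch. It uses exact fractions so that the results can be compared directly with the tabulated values; the function name `central_moments` is ours, not the author's.

```python
from fractions import Fraction
from math import comb


def central_moments(xs, ys, orders=(2, 3, 4)):
    """Mean and moments about the mean for isolated ordinates ys at xs."""
    total = sum(ys)
    mean = Fraction(sum(x * y for x, y in zip(xs, ys)), total)
    moments = {
        k: sum(Fraction(y) * (x - mean) ** k for x, y in zip(xs, ys)) / total
        for k in orders
    }
    return mean, moments


xs = list(range(7))
ys = [comb(6, x) * 2 ** x for x in xs]  # 729 times the terms of (1/3 + 2/3)^6
mean, mu = central_moments(xs, ys)
print(mean, mu[2], mu[3], mu[4])
```

The negative third moment reflects the skewness of the series, the longer tail lying towards the lower values of $x$.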

In  the  above  illustration,  we  have  considered  x  to  have 
integral  values  only.  This  may  be  said  to  approximate  to 
the  conditions  of  many  statistical  tables  used  by  the  Actuary 
where  x  represents  the  year  of  age  under  observation,  and 
where  it  is  indifferent  whether  the  observations  are  supposed 
to  be  spread  over  the  year  in  the  form  of  a  continuous  curve, 
or  whether  we  consider  them  all  to  have  reference  to  the 
central  point  of  the  year.  In  these  cases,  however,  x  will 
generally  have  a  large  range  of  values,  amounting  possibly 
to  60  or  80,  and  the  labour  of  computing  the  numerical 
value  of  the  moments  is  then  much  lessened  by  grouping 
the  facts  in  larger  sections,  though  we  cannot  then  safely 
assume  the  totals  of  each  group  to  be  concentrated  at  the 
middle  ordinate. 

Take the set of observations in Table IX representing,
for decennial age groups, the numbers exposed to risk in the
middle of each year of age, i.e., $E_{x+\frac{1}{2}} = E_x - \tfrac{1}{2}\theta_x$, in the recent
mortality experience of lives assured by ascending premium
policies,* excluding the first ten years from entry. Here we
have  no  longer  the  values  of  equidistant  ordinates  of  the 
curve,  but  the  area  of  the  curve  enclosed  between  successive 
ordinates.  To  obtain  the  moments  of  the  curve  with  any 
degree  of  accuracy,  we  cannot  treat  these  areas  as 
proportional  to  their  central  ordinate. 

It will be noticed that the particular curve we are dealing
with becomes gradually zero at either extremity,† and we may
assume, without serious error, that it makes "close contact"
at either end with the axis of $x$, that is to say, is
asymptotic thereto. In these cases, Mr. Sheppard has shown‡
that very approximate values for the moments may be found
by treating the area of each successive section of the curve
as concentrated in the middle ordinate of the section; in
other words, treating the values of $y$ as representing isolated
ordinates exactly as was done in Table VIII; and then
applying to the values of the moments so found (denoted by
the symbol $m'$) the following adjustments leading to the
corrected moments denoted by the symbol $m$:

$$m_2 = m'_2 - \tfrac{1}{12}$$

$$m_3 = m'_3 - \tfrac{1}{4}m_1 = m'_3 - \tfrac{1}{4}m'_1$$

$$m_4 = m'_4 - \tfrac{1}{2}m'_2 + \tfrac{7}{240} = m'_4 - \tfrac{1}{2}m_2 - \tfrac{1}{80}$$

For moments round the centroid vertical these become,
remembering that $\mu_1 = 0$,

$$\mu_2 = \mu'_2 - \tfrac{1}{12}$$

$$\mu_3 = \mu'_3$$

$$\mu_4 = \mu'_4 - \tfrac{1}{2}\mu_2 - \tfrac{1}{80}$$

* See Unadjusted Data, Minor Classes of Assurances, p. 191.

† We omit the numbers at risk under age 25 (arising from entrants under
age 15), amounting to only 25 in all.

‡ An elementary demonstration is given in Elderton's Treatise, pp. 28-29.


Table IX.

Ascending Premium Assurances: Experience 1863-1893.

Duration 10 years and upwards.
Calculation of Moments of "Exposed to Risk" Curve.

            Exposed to
  Ages       Risk (y)     x        xy       x²y       x³y       x⁴y
 25-35         2,874     -2     -5,748    11,496   -22,992    45,984
 35-45        22,020     -1    -22,020    22,020   -22,020    22,020
 45-55        26,164      0
 55-65        17,391      1     17,391    17,391    17,391    17,391
 65-75         7,845      2     15,690    31,380    62,760   125,520
 75-85         1,761      3      5,283    15,849    47,547   142,641
 85-95            81      4        324     1,296     5,184    20,736

 Totals       78,136            10,920    99,432    87,870   374,292
 Reduced to        1            .13976    1.2725    1.1246    4.7903
 unit area                      = m'₁     = m'₂     = m'₃     = m'₄
From these results we obtain by means of the corrections
above stated:

$$m_1 = .13976;\quad m_2 = 1.1892;\quad m_3 = 1.0897;\quad m_4 = 4.1832.$$

Whence, by the equations on p. 41,

$$\mu_2 = 1.1697;\quad \mu_3 = .5965;\quad \mu_4 = 3.7122.$$
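
The figures just quoted can be reproduced with the following sketch, which applies Sheppard's adjustments to the raw moments of Table IX and then transfers them to the mean. The transfer formulae are the standard ones (the text refers to p. 41 for them); the function names are ours. Small differences in the last figure arise because the text works from rounded intermediate values.

```python
# Table IX: numbers exposed to risk in decennial groups; x = 0 is age 50,
# and the unit of x is ten years.
exposed = [2874, 22020, 26164, 17391, 7845, 1761, 81]
xs = [-2, -1, 0, 1, 2, 3, 4]


def raw_moments(xs, ys, max_order=4):
    """Moments about the origin, areas treated as central ordinates."""
    total = sum(ys)
    return [
        sum(y * x ** k for x, y in zip(xs, ys)) / total
        for k in range(1, max_order + 1)
    ]


def sheppard(m_raw):
    """Sheppard's adjustments for a unit grouping interval."""
    m1, m2, m3, m4 = m_raw
    return [m1, m2 - 1 / 12, m3 - m1 / 4, m4 - m2 / 2 + 7 / 240]


def to_centroid(m):
    """Transfer moments about the origin to moments about the mean."""
    m1, m2, m3, m4 = m
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu2, mu3, mu4


m_prime = raw_moments(xs, exposed)
m = sheppard(m_prime)
mu2, mu3, mu4 = to_centroid(m)
print([round(v, 4) for v in m], round(mu2, 4), round(mu3, 4), round(mu4, 4))
```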

If quinquennial age groups had been used, making due
allowance for the unit of time still being taken as ten years,
the corresponding values would have been

$$m_1 = .13848;\quad \mu_2 = 1.1741;\quad \mu_3 = .5869;\quad \mu_4 = 3.7160.$$

Using these latter values, as the more accurate, we obtain for
the values of the functions $\beta_1$, $\beta_2$ and $\gamma$:

$$\beta_1 = \mu_3^2/\mu_2^3 = .21283;\quad \beta_2 = \mu_4/\mu_2^2 = 2.6957;\quad \gamma = \frac{\beta_1+4}{\beta_2+3} = .7397.$$

As $\mu_3$ does not vanish, and $\gamma$ exceeds $\tfrac{2}{3}$, we see from the table
on pp. 140-1, that if the series can be represented by any of
the curves there given, it must be by No. 4, excluding the skew
binomial as unsuitable for reasons already given. It is also
obvious from the run of the figures in Table IX, that the
curve is limited in both directions. Equating the expressions
in Table IX with the above numerical values, we have

$$\gamma = \frac{2}{3}\cdot\frac{n+3}{n+2} = .7397;\quad \text{whence } n = 7.13$$

$$\beta_1 = \frac{4(n+1)}{(n+2)^2}\cdot\frac{(p-q)^2}{pq} = .21283;$$

$$\text{whence } (p-q)^2 = .5453\,pq$$

$$(p+q)^2 = 4.5453\,pq = 1 \quad (\text{since } p+q=1)$$

$$(p-q) = \sqrt{\frac{.5453}{4.5453}} = .3464$$

$$\text{giving } p = .6732;\quad q = .3268$$

$$\mu_2 = \frac{4pq}{n+1}\,a^2 = 1.1741;\quad \text{whence } a = 3.293$$

thus giving a range of 32.93 years on either side of the age
for which the value of $x$ in the formula $=0$. This has nothing
to do with the zero point (age 50) in Table IX. The mean
age as is seen from that table is $50 + 1.385 = 51.385$. The
value of $m_1$, the mean as computed by the above formula, is

$$m_1 = (q-p)a = -1.1407$$
that is, 11.407 years earlier than the central point of the range,
giving for the latter, $51.385 + 11.407 = 62.79$, say. The range
of the curve is therefore from age 29.86 to age 95.72; and,
computing the values of $np-1$ and $nq-1$, we have, for the
final form of the equation of the curve, when $x$ = the age,
$y = y_0\,(x-29.86)^{1.33}\,(95.72-x)^{3.80}$, where $y_0$ is a constant multiplier.
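
The constants of this worked example can be re-derived numerically. The sketch below assumes the relations used in the working above for a curve limited in both directions, namely $\gamma = (\beta_1+4)/(\beta_2+3) = \tfrac{2}{3}(n+3)/(n+2)$, $\beta_1 = 4(n+1)(p-q)^2/\{(n+2)^2pq\}$ and $\mu_2 = 4pq\,a^2/(n+1)$; the function name and the returned dictionary keys are ours. It carries full precision throughout, so its results differ slightly in the last figure from the rounded values in the text.

```python
from math import sqrt


def fit_limited_range_curve(mu2, mu3, mu4):
    """Solve for n, p, q and the half-range a from three central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    gamma = (beta1 + 4) / (beta2 + 3)
    n = 1 / (1.5 * gamma - 1) - 2                  # gamma = 2(n+3) / (3(n+2))
    ratio = beta1 * (n + 2) ** 2 / (4 * (n + 1))   # (p-q)^2 / (pq)
    pq = 1 / (ratio + 4)                           # because (p+q)^2 = 1
    diff = sqrt(ratio * pq)                        # p - q, taken as positive
    p, q = (1 + diff) / 2, (1 - diff) / 2
    a = sqrt(mu2 * (n + 1) / (4 * pq))
    return {
        "beta1": beta1, "beta2": beta2, "gamma": gamma,
        "n": n, "p": p, "q": q, "a": a,
        "mean_offset": (q - p) * a,                # mean relative to mid-range
        "exponents": (n * q - 1, n * p - 1),
    }


fit = fit_limited_range_curve(1.1741, 0.5869, 3.7160)
print({k: v for k, v in fit.items() if k != "exponents"}, fit["exponents"])
```

Multiplying `a` and `mean_offset` by ten converts them to years, the unit of $x$ here being ten years.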

It  is  often  a  convenience,  however,  to  have  the  values 
of  the  central  ordinates  of  the  groups,  which  may  be 
approximately obtained by interpolation. If the numbers in
any group are represented by the symbol $u_x$, the number of
years in each group being $t$, the value of the central ordinate
of the group (that is to say, the numbers under observation
exactly at the central age of the group) will be approximately
$\frac{1}{t}\left(u_x - \frac{1}{24}\Delta^2 u_{x-t}\right)$. As, however, it is convenient to treat the
interval $t$ as the unit, for the time being, we may write as the
values of the central ordinates $u_x - \frac{1}{24}\Delta^2 u_{x-1}$ (the original
numbers for each group less $\frac{1}{24}$th of their respective central
second differences). In the class of curves we are discussing,
namely, those having close contact at both ends with the axis
of $x$, the numerical values of the moments as deduced from
these ordinates will be very nearly the values for the
continuous curve, unless the number of groups is very
small. Thus the values of $\int y\,dx$, and of the functions
$\int xy\,dx$, $\int x^2y\,dx$, will be found by taking the sum of the
ordinates of $y$, computed as above, and the sum of the
products $xy$, $x^2y$.

An  advantage  attaching  to  the  use  of  ordinates  in  lieu  of 
areas  is  that,  in  the  class  of  curves  we  are  dealing  with,  we 
can,  by  examination  of  the  differences  of  the  logarithms  of 
the  ordinates,  gain  a  better  idea  of  the  nature  of  the  curve 
than can be obtained from the grouped figures. (See Third
Lecture, p. 50.) It is also easier to compare the graduated
figures as given by the frequency curve by means of isolated
ordinates than by means of groups or areas.

* The formula to 4th differences is $u_x - \frac{1}{24}\Delta^2 u_{x-1} + \frac{3}{640}\Delta^4 u_{x-2}$ nearly, and in
order that the resulting 4th moment should agree exactly with that obtained
from the use of the grouped figures, or areas, with Sheppard's corrections, the
4th difference is required, but for practical purposes it is not often needed.
The use of the central ordinates of the groups has
the incidental advantage, which is very considerable
in the case of a mortality or similar experience, of
giving trustworthy values of the force of mortality, or
corresponding function, for the ages corresponding to the
position of the ordinates. In the usual plan of summarizing
a mortality table by giving the numbers at risk and deaths
in consecutive age groups, the ratio of the deaths to the
numbers at risk in each group is not a useful function, as it
does not correctly represent the mortality for the central age,
except near the middle of the table, where the numbers under
observation in successive years are nearly constant.

We  may  apply  this  method  to  the  example  already  dealt 
with  on  p.  55,  viz.,  the  experience  of  ascending  premium 
policies.  The  calculations  as  set  out  in  the  following  tabular 
form  are  sufficiently  clear : 


Table X.

Mortality experience of lives assured by ascending Premiums,
1863-1893.     Duration 10 years and upwards.

                                              *Estimated Central
           Central                                Ordinates
           age of     Exposed               Exposed                μ_x for
  Ages     group (x)  to Risk     Died      to Risk     Died    Central Age
   (1)       (2)        (3)        (4)        (5)        (6)        (7)
  25-30     27.5        266          2         168         .8      .0048
  30-35     32.5      2,607.5       31       2,448       29.2      .0119
  35-40     37.5      8,788        102       8,860      102.0      .0115
  40-45     42.5     13,232.5      173      13,389      175.2      .0131
  45-50     47.5     13,910        192      14,007      191.7      .0137
  50-55     52.5     12,254        218      12,284      218.6      .0178
  55-60     57.5      9,878.5      229       9,878      228.4      .0232
  60-65     62.5      7,512.5      255       7,518      255.4      .0340
  65-70     67.5      5,007.5      271       4,994      274.4      .0549
  70-75     72.5      2,837        206       2,809      205.6      .0732
  75-80     77.5      1,347.5      151       1,324      151.4      .1144
  80-85     82.5        413.0       85         389       84.8      .2180
  85-90     87.5         77         24          66       22.3      .3379
  90-95     92.5          4.5        3           2        2.2     1.1000

  Totals     ...     78,136      1,942      78,136    1,942.0       ...

* Taking 5 years as the unit, computing by formula $u_x - \frac{\Delta^2 u_{x-1}}{24}$, where
$u_x$ represents the number in columns (3) and (4). By this formula there are
-11 persons exposed to risk at age 22.5; these have been included in the
group 25-30.
If the values of the moments are computed from these columns
exactly as was done with the Binomial Curve (Table VIII, p. 53)
they will be found to be practically identical with those found
above. The estimated values of $\mu_x$ for the central ages of the
group are inserted as they will be used later.
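
The estimated central ordinates of Table X can be recomputed from the grouped figures. The sketch below applies $u_x - \frac{1}{24}\Delta^2 u_{x-1}$ with the quinquennial group as the unit, treating the groups beyond either end of the table as empty (an assumption of this sketch; the function name is ours). The first group in the table also absorbs the small negative ordinate at age 22.5, so that entry is compared separately.

```python
def central_ordinates(group_totals):
    """Central ordinate of each group: u minus 1/24 of its central
    second difference, with empty groups assumed beyond both ends."""
    padded = [0.0] + [float(u) for u in group_totals] + [0.0]
    return [
        mid - (low - 2.0 * mid + high) / 24.0
        for low, mid, high in zip(padded, padded[1:], padded[2:])
    ]


# Columns (3) and (4) of Table X, ages 25-30 to 90-95.
exposed = [266, 2607.5, 8788, 13232.5, 13910, 12254, 9878.5,
           7512.5, 5007.5, 2837, 1347.5, 413, 77, 4.5]
died = [2, 31, 102, 173, 192, 218, 229, 255, 271, 206, 151, 85, 24, 3]

exposed_ord = central_ordinates(exposed)
died_ord = central_ordinates(died)
rates = [d / e for d, e in zip(died_ord, exposed_ord)]

# Ordinate of the empty group centred at age 22.5, folded into group 25-30.
first_group_folded = exposed_ord[0] - exposed[0] / 24.0
print(round(exposed_ord[4]), round(died_ord[4], 1), round(rates[4], 4))
```

The ratio in `rates` is the estimate of the force of mortality at the central age of each group, as in column (7).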


In  many  cases  the  principle  of  the  method  of  moments 
may  be  used  to  fit  a  curve  to  a  series  of  observations  without 
actually  computing  the  numerical  values  of  the  moments 
themselves,  using  instead  the  successive  summations  of  the 
ordinates,  or  areas,  from  which  the  moments  can  be  readily 
obtained  if  required.  This  method  is  also  useful  if  one  or 
both  limits  to  the  range  of  the  curve  can  be  assumed. 

Consider  a  scheme  such  as  the  following,  in  which,  with  a 
view  to  clearness,  we  use  actual  numbers  of  the  series,  given 
on  p.  53,  instead  of  symbols  : — 


   x      u_x    Σu_x    Σ²u_x     Σ³u_x     Σ⁴u_x     Σ⁵u_x     u_x × x⁴
   0        1     729      ...       ...                                0
   1       12     728    2,916     7,776                               12
                                  (6,318)
   2       60     716    2,188     4,860     9,180    15,660          960
                                                     (11,070)
   3      160     656    1,472     2,672     4,320     6,480       12,960
   4      240     496      816     1,200     1,648     2,160       61,440
   5      192     256      320       384       448       512      120,000
   6       64      64       64        64        64        64       82,944

                                                         Total    278,316

In this scheme, each column is formed from the preceding
by successive addition from the bottom, in the same way
that the $M_x$ column is formed from $C_x$, and $R_x$ from $M_x$.

If we take the value against $x=0$ in the column $\Sigma u_x$, say
$\Sigma u_0$, we see that each value of $u_x$ occurs once only in that
total. In the total appearing against $x=1$ in the second
summation, say $\Sigma^2 u_1$, each value of $u_x$ occurs $x$ times;
similarly the total against $x=2$ in the column $\Sigma^3 u_x$, say
$\Sigma^3 u_2$, represents the sum of the products $\frac{x(x-1)}{2!}u_x$; and the
total against $x=3$ in the column $\Sigma^4 u_x$, say $\Sigma^4 u_3$, represents
the total of the products $\frac{x(x-1)(x-2)}{3!}u_x$, and so on, the
coefficients following the Binomial law. It is evident from
this that the sums of the products $x^2u_x$, $x^3u_x$, &c., are
implicitly contained in these totals; and that if these sums
of the graduated and ungraduated values are in agreement,
the moments of the two curves will also agree. Writing $m_n$
as the value of the $n$th moment round the ordinate of $x=0$,
we shall find:*


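$$m_1 = \frac{\Sigma^2 u_1}{\Sigma u_0}$$

$$m_2 = \frac{2\Sigma^3 u_2 + \Sigma^2 u_1}{\Sigma u_0}$$

$$m_3 = \frac{6\Sigma^4 u_3 + 6\Sigma^3 u_2 + \Sigma^2 u_1}{\Sigma u_0}$$

$$m_4 = \frac{24\Sigma^5 u_4 + 36\Sigma^4 u_3 + 14\Sigma^3 u_2 + \Sigma^2 u_1}{\Sigma u_0}$$

These formulae may be simplified if we write them in a
form analogous to central difference formulae, writing, for
example:

$$\Sigma^3 u_{1\frac{1}{2}} = \tfrac{1}{2}\left(\Sigma^3 u_1 + \Sigma^3 u_2\right)$$

these average values being shown in antique type in the
Scheme. We then have, omitting the common divisor $\Sigma u_0$:

$$m_1 = \Sigma^2 u_1$$

$$m_2 = 2\Sigma^3 u_{1\frac{1}{2}}$$

$$m_3 = 6\Sigma^4 u_2 + m_1$$

$$m_4 = 24\Sigma^5 u_{2\frac{1}{2}} + m_2$$

The equivalence of the above formulae may be illustrated by
the following numerical examples based on the above scheme.

* See the demonstration in Note E, p. 124.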
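Using $N$ as an abbreviation of $\Sigma u_0$ = the total number of
observations, we have

$$N\,m_1 = 2916 = \Sigma^2 u_1$$

$$N\,m_2 = 12636 = 2\Sigma^3 u_2 + \Sigma^2 u_1 = 2\times 4860 + 2916$$
$$\phantom{N\,m_2 = 12636} = 2\Sigma^3 u_{1\frac{1}{2}} = 2\times 6318$$

$$N\,m_3 = 57996 = 6\Sigma^4 u_3 + 6\Sigma^3 u_2 + \Sigma^2 u_1 = 6\times 4320 + 6\times 4860 + 2916$$
$$\phantom{N\,m_3 = 57996} = 6\Sigma^4 u_2 + \Sigma^2 u_1 = 6\times 9180 + 2916$$

$$N\,m_4 = 278316 = 24\Sigma^5 u_4 + 36\Sigma^4 u_3 + 14\Sigma^3 u_2 + \Sigma^2 u_1$$
$$\phantom{N\,m_4 = 278316} = 24\times 2160 + 36\times 4320 + 14\times 4860 + 2916$$
$$\phantom{N\,m_4 = 278316} = 24\Sigma^5 u_{2\frac{1}{2}} + 2\Sigma^3 u_{1\frac{1}{2}} = 24\times 11070 + 2\times 6318$$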

The last may be compared with the direct calculation of
$x^4u_x$ given in the last column of the scheme. The values of
the moments through the centroid vertical may be obtained if
required by the formulas:

$$\mu_2 = m_2 - (m_1)^2$$

$$\mu_3 = m_3 - 3(m_1)\mu_2 - (m_1)^3$$

$$\mu_4 = m_4 - 4(m_1)\mu_3 - 6(m_1)^2\mu_2 - (m_1)^4$$

Where  the  number  of  terms  in  the  series  is  few,  there  is  no 
special  advantage  in  this  method ;  but  if  the  number  of  terms 
is  considerable  it  effects  a  saving  of  time,  more  particularly 
if  the  calculation  of  the  moments  round  the  centroid  vertical 
is  not  needed  by  the  conditions  of  the  problem,  as  in  the  case 
of the graduation of rates of mortality by Makeham's or
any  similar  frequency  formula. 
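
The summation scheme lends itself to a few lines of code. The following sketch builds the successive summation columns for the binomial series of p. 53 and recovers $N m_1$ to $N m_4$ from the totals picked out in the formulae above, comparing them with the directly computed sums of $x^k u_x$. The function name is ours.

```python
def summation_columns(u, depth=5):
    """Return [Σu, Σ²u, ..., Σ^depth u]; each column is formed from the
    preceding one by continued addition from the bottom."""
    columns = []
    current = list(u)
    for _ in range(depth):
        running, summed = 0, []
        for value in reversed(current):
            running += value
            summed.append(running)
        current = summed[::-1]
        columns.append(current)
    return columns


u = [1, 12, 60, 160, 240, 192, 64]
S1, S2, S3, S4, S5 = summation_columns(u)

N = S1[0]
N_m1 = S2[1]
N_m2 = 2 * S3[2] + S2[1]
N_m3 = 6 * S4[3] + 6 * S3[2] + S2[1]
N_m4 = 24 * S5[4] + 36 * S4[3] + 14 * S3[2] + S2[1]

# Direct sums of x^k * u_x for comparison.
direct = [sum(x ** k * ux for x, ux in enumerate(u)) for k in range(1, 5)]
print(N, N_m1, N_m2, N_m3, N_m4, direct)
```

Only additions are needed until the final step, which is the saving of labour the text refers to when the series is long.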

The case of curves not making close contact with the axis
of $x$ at both ends requires to be considered separately, but the
results obtained are not altogether satisfactory, see Elderton,
pages 29-30. The difficulty can, however, to a great extent
be avoided in most cases arising in actuarial work by using
very small groups, or even individual values for each year of
age, &c., in calculating the moments. The labour although
increased is by no means prohibitive if the summation method,
above described, be adopted.

Professor Karl Pearson has shown* that the method of
fitting a curve by computing its moments should lead to
nearly the same results as the method of least squares. If we
are fitting to a given set of observations an ordinary parabolic
curve, represented by the equation $y = a + bx + cx^2 + \&c.$, then
the method of moments and the method of least squares are
identical.† He infers from this fact that, even if $y$ is
represented by a more complex expression, the numerical
results from the method will be nearly the same as with the
method of least squares. It would appear at first sight that
the effect of the method of moments is to give equal weight
to each observation or group of observations, in spite of their
having unequal average errors; whereas the method of
least squares should, strictly speaking, be applied only when
the average error of each observation is nearly equal.‡ In a
mortality table, where the number of persons under observation
and the number of deaths are relatively large in the middle
of the table and fall off to zero at the beginning and end, the
probability of a given error in the value of $q$ is very much
smaller at the central ages; while, on the other hand, the
probability of a deviation of a unit in the number of deaths is
correspondingly greater. The same applies to most tables of
statistics, as they usually present a series starting from zero,
rising to a maximum, and diminishing to zero again, the
weight of the observations being in the middle of the curve,
where, however, the probability of a given numerical deviation
in the actual numbers is also greater.

* Biometrika, vol. i, pp. 266-271.

† This assumes that the unadjusted moments ($m'$ not $m$) are used, i.e., that
the numbers represent ordinates and not areas. If the moments are assumed to
represent areas and the corresponding corrections are introduced, the method of
moments no longer gives precisely the same results as the method of least
squares: see examples given by Todhunter, J.I.A., xli, 444.

‡ See Note C, p. 117.

We have seen that in a series of numbers representing the
distribution of a group into sub-groups the average error in any
sub-group is approximately $.8\sqrt{\frac{m(n-m)}{n}}$, where $n$ is the
number in the group and $m$ the (graduated) number in the
sub-group. If, as is generally the case, $n$ is large compared to $m$,
this expression may be taken as equal to $.8\sqrt{m}$, the average
error in the ratio $\frac{m}{n}$ being approximately $.8\frac{\sqrt{m}}{n}$. Thus, if
the number at risk at a given age equals $n$ and the true
probabilities of death and survivorship are $q$ and $p$, then
$.8\sqrt{npq}$* (which, as $p$ is nearly unity for the greater number
of ages, may be roughly taken as $.8\sqrt{\text{number of deaths}}$),
is an approximate expression for the average deviation from
the expected number of deaths. The method of moments,
if employed to represent a given series by a parabolic curve,
assumes an equal probability of unit error in each term of
the series. If, therefore, the series is of such a character
that the extreme values are relatively small, these parts of
the data will have somewhat less than their due weight in the
fitting process. If, however, the formula to be fitted does
not represent a parabolic curve, but a curve analogous to the
normal curve $k e^{-x^2/c^2}$, say a curve of the form $e^{a+bx+cx^2+\&c.}$,
then it will be found that, on the assumption that the mean
error in any value $y$ is equal to $\sqrt{y_1}$ (where $y_1$ represents the
graduated value of $y$) the method of moments gives the same
result as the method of least squares when the observations
are duly weighted (see Note F, p. 129).
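
The factor .8 in the expression for the average deviation is close to $\sqrt{2/\pi} = .798$, the ratio of the mean deviation to the standard deviation in the normal curve. As a rough numerical check, the sketch below computes the exact mean absolute deviation of the number of deaths among $n$ lives, assuming independent lives each with probability $q$ of death (a binomial model), and compares it with $.8\sqrt{npq}$. The values of `n` and `q` are hypothetical.

```python
from math import comb, sqrt


def binomial_mean_abs_deviation(n, q):
    """Exact average absolute deviation of a binomial count from its mean."""
    mean = n * q
    return sum(
        comb(n, k) * q ** k * (1 - q) ** (n - k) * abs(k - mean)
        for k in range(n + 1)
    )


n, q = 400, 0.05            # hypothetical numbers at risk and rate
exact = binomial_mean_abs_deviation(n, q)
approx = 0.8 * sqrt(n * q * (1 - q))
rough = 0.8 * sqrt(n * q)   # taking p as nearly unity
print(round(exact, 3), round(approx, 3), round(rough, 3))
```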


We  come  now  to  the  class  of  curves  representing  not 
the  actual  numbers  in  statistical  tables,  but  the  ratios  of  the 
corresponding  numbers  in  the  double  series,  such  as  those  of 
tables  of  "  Exposed  to  Risk  "  and  "  Died  ",  curves,  that  is, 
representing  such  functions  as  rates  of  mortality,  of  marriage, 
of  lapse,  of  superannuation,  &c.  The  most  interesting  and 
important  of  these  is  the  curve  due  to  Makeham's  development 
of  Gompertz's  hypothesis,  in  which  the  force  of  mortality  at  a 
given  age  x  is  represented  by  the  expression 

$$\mu_x = A + Bc^x$$

leading to the equation

$$\log_{10} l_x = K + A'x + B'c^x.$$

This curve has a double value as, apart from its use in
graduating a mortality table, it has the valuable property
that the values of annuities on $n$ joint lives of various ages
can be found from a table of single entry showing the values
of annuities on $n$ lives of equal age. Owing to its importance
it will be useful to give some attention to the problem of
fitting this curve to a mortality experience. We will first
consider the case of an aggregate or non-select table, that is,
a table in which the rate of mortality is a function of the age
alone.

* See Note A, p. 110; J.I.A., xxvii, 214.

Various methods have been employed to obtain the values
of the constants $A$, $B$, $c$, corresponding to a given experience.
That used by Makeham, and subsequently in a modified form
by Woolhouse, is based on selected values of $\log l_x$ taken from
a table already graduated by a finite difference formula.
Four values of $\log l_x$ may be taken, covering practically the
whole of adult life, say the values at ages 20, 40, 60, and 80,
or 25, 45, 65, 85. Either set is sufficient to determine
the four constants, $K$, $A'$, $B'$ and $c$, as above. In Woolhouse's
graduation of the $H^M$ Table, both of these sets of ages were
employed, the most advantageous values of the constants
being found by comparing the deviations between the
graduated and ungraduated values of $l_x$ at quinquennial
ages according to the two preliminary graduations. If a
single set of four values of $l_x$ is taken as the basis of the
graduation, the effect is the same as employing the sums of
the forces of mortality ($\mu_{x+t}$) between the selected ages,
giving equal weight to the values at each age.
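
How four equidistant values of $\log l_x$ determine the four constants may be sketched as follows. With $\log_{10} l_x = K + A'x + B'c^x$ and an interval of $t$ years, the two second differences are $B'c^{x}(c^t-1)^2$ and $B'c^{x+t}(c^t-1)^2$, so their ratio gives $c^t$, and the remaining constants follow in turn. The numerical constants below are hypothetical, chosen only to show that the procedure recovers them; this is one way of solving the four equations, not necessarily the arrangement used by Makeham or Woolhouse.

```python
def makeham_from_four(ages, log_l):
    """Solve log l_x = K + A'x + B'c^x from four equidistant values."""
    x0 = ages[0]
    t = ages[1] - ages[0]
    d1 = [later - earlier for earlier, later in zip(log_l, log_l[1:])]
    d2 = [later - earlier for earlier, later in zip(d1, d1[1:])]
    c_t = d2[1] / d2[0]                       # ratio of second differences
    c = c_t ** (1.0 / t)
    b_prime = d2[0] / (c ** x0 * (c_t - 1) ** 2)
    a_prime = (d1[0] - b_prime * c ** x0 * (c_t - 1)) / t
    k = log_l[0] - a_prime * x0 - b_prime * c ** x0
    return k, a_prime, b_prime, c


# Hypothetical constants used to manufacture the four values.
K0, A0, B0, C0 = 5.0, -0.003, -0.00005, 10 ** 0.04
ages = [20, 40, 60, 80]
log_l = [K0 + A0 * x + B0 * C0 ** x for x in ages]
K1, A1, B1, C1 = makeham_from_four(ages, log_l)
print(K1, A1, B1, C1)
```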

The method employed by Mr. King in the Institute of
Actuaries' Text-Book, Part II., substitutes for graduated
values of $\log l_x$ at isolated ages, the sum of certain
groups of the ungraduated values of $\log l_x$. The effect
of this method would appear to be to give a diminishing
weight to the values of $q_x$ for the ages at the commencement
and end of the table, which is so far in accordance with
theory, and to eliminate the effect of errors in isolated
values of $l_x$. In Biometrika (vol. i., pp. 298-303) Prof. Pearson
has dealt with the same problem, basing the values of the
constants upon the successive summations of $\log l_x$.

It is, perhaps, preferable to deal directly with the actual
exposures and deaths in a manner similar to that first
described by Makeham (J.I.A., vol. xvi, p. 344). This can
be readily done, and the same method of summations or
moments applied as in the case of any other frequency curve.
Tabulate $E_{x+\frac{1}{2}}$, that is, the number exposed to risk in the
middle of the year of age $x$, and $\theta_x$ representing the deaths
occurring between ages $x$ and $x+1$. Assuming, as we may with
sufficient accuracy for ordinary purposes,* that the force of
mortality at age $x+\tfrac{1}{2}$, or the function $\operatorname{colog}_e p_x$, is equal
to $m_x$ the "central death rate" $= \dfrac{\theta_x}{E_{x+\frac{1}{2}}}$, we have

$$E_{x+\frac{1}{2}}\left(A + Bc^{x+\frac{1}{2}}\right) = \theta_x.$$

If we knew the value of $c$, we could then tabulate the values
of $E_{x+\frac{1}{2}}$, $E_{x+\frac{1}{2}}c^{x+\frac{1}{2}}$, $\theta_x$ respectively, and summing these values
continuously to the end of the table, and again taking the
total of these sums, we should obtain equations in this
form:

$$\left(\Sigma\Sigma E_{x+\frac{1}{2}}\right)A + \left(\Sigma\Sigma E_{x+\frac{1}{2}}c^{x+\frac{1}{2}}\right)B = \left(\Sigma\Sigma\theta_x\right)$$

a  simple  simultaneous  equation  for  determining  A  and  B. 
As  a  matter  of  fact,  the  value  of  logioC  does  not  usually 
differ  very  much  from  "04,  and  in  general  it  will  be  found 
that  a  small  change  in  the  value  of  log  c  does  not  involve  a 
serious  change  in  the  general  character  of  the  table.  In  an 
important  series  of  observations,  however,  we  cannot  assume 
the  value  of  c.  Either  we  must  determine  c  by  a  method 
such  as  that  used  by  Mr.  Woolhouse  or  Mr.  King,  Avhich  will 
give  a  sufficient  approximate  value,  or  we  may  adopt  two  or 
more  alternative  values  of  c,  which  appear  likely  to  contain 
between  them  the  true  value.  Having  obtained  the  values 
of  constants  A  and  B  for  each  given  value  of  c,  set  out 
the  expected  or  graduated  deaths,  and  compare  them  with 
the  actual  numbers  in  suitable  age  groups.  If  the 
third  summation  of  the  differences  of  the  graduated  and 
ungraduated    deaths    is    computed,    it    will   be    possible  by 

•Assniniiif,'  tlio  usual  table  of  Kj-  and  6^.  to  represent  accurately  the  facts 
and    to    be    undisturbed   at    the    older  ages   (where  alone   the    point  is  of   any 

0 
importance)    by    entrances    or    by   exits    other   than   by   death,   then        =yj; 

a 

acurately;    and    colog  ePi=  z-, .  —,    very   nearly,    where  Wj;   is   the 

a 

"  central  death  rate  "  :n — Ta"  •     '^^^  error  caused  by  omitting  the  small  term 

a 

in  the  denominator  and  taking  colog  «pj;=         ^    -   is  only  apjircciable  at  the 

"X  ~\"x 

older  ages,  amounting  to  1  per-cent  in  the  rate  of  mortality  where  2x  =  '3 
or  about  age  90. 
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interpolation  to  obtain  a  value  of  log  c,  making  these  nearly 
equal  to  zero.  Putting  the  matter  into  the  language  of 
moments,  we  shall  then  have  made  the  first,  second  and 
third  moments  of  the  graduated  and  ungraduated  curves 
equal,  and  in  that  way  we  shall  have  selected  what  may  be 
considered  the  best  values  of  the  constants  A,  B  and  c* 
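
The trial-value process just described may be sketched numerically as follows. This is only an illustration under stated assumptions: the expected deaths are taken as $E_x(A + Bc^x)$, the exposures and constants are synthetic, and ordinary power moments (weights 1, $x$, $x^2$) are used in place of successive summations, to which they are equivalent for this purpose. The function names are hypothetical.

```python
import math

def fit_a_b(ages, exposed, deaths, c):
    """For a trial c, solve the two linear equations that make graduated and
    actual deaths agree in total and in first moment about the origin."""
    s_e = sum(exposed)
    s_ec = sum(e * c**x for x, e in zip(ages, exposed))
    s_xe = sum(x * e for x, e in zip(ages, exposed))
    s_xec = sum(x * e * c**x for x, e in zip(ages, exposed))
    s_d = sum(deaths)
    s_xd = sum(x * d for x, d in zip(ages, deaths))
    det = s_e * s_xec - s_ec * s_xe
    a = (s_d * s_xec - s_ec * s_xd) / det
    b = (s_e * s_xd - s_xe * s_d) / det
    return a, b

def third_condition_gap(ages, exposed, deaths, c):
    """Remaining disagreement (weight x**2) once A and B have been fitted."""
    a, b = fit_a_b(ages, exposed, deaths, c)
    return sum(x * x * (d - e * (a + b * c**x))
               for x, e, d in zip(ages, exposed, deaths))

def interpolate_log_c(ages, exposed, deaths, log_c1, log_c2):
    """Linear interpolation between two trial values of log10(c) for the
    value that brings the remaining disagreement to zero."""
    g1 = third_condition_gap(ages, exposed, deaths, 10**log_c1)
    g2 = third_condition_gap(ages, exposed, deaths, 10**log_c2)
    return log_c1 - g1 * (log_c2 - log_c1) / (g2 - g1)

# Synthetic data following Makeham's law exactly (illustrative values only).
AGES = list(range(20, 91))
EXPOSED = [1000.0 * (x - 15) * math.exp(-(x - 15) / 12.0) for x in AGES]
TRUE_A, TRUE_B, TRUE_LOG_C = 0.006, 0.00005, 0.04
DEATHS = [e * (TRUE_A + TRUE_B * 10 ** (TRUE_LOG_C * x))
          for x, e in zip(AGES, EXPOSED)]

# Two trial values believed to contain the true value between them.
LOG_C_ESTIMATE = interpolate_log_c(AGES, EXPOSED, DEATHS, 0.039, 0.0405)
print(f"interpolated log10(c) = {LOG_C_ESTIMATE:.5f}")
```

With real data the deviations do not vanish exactly, and the value so found is a guide rather than a final answer.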

It  may  be  objected  that  the  use  of  this  particular  method 
is  open  to  the  same  implication  of  giving  equal  weight  to  all 
the observations, as in the case of the values of $l_x$. We can
avoid  that  objection  by  duly  weighting  the  observations  at 
each  age  by  multiplying  the  "  exposed "  and  "  died "  at 
each  age  by  the  approximately  graduated  values  of  {dx)~^- 
But  although  this  would  give  suitable  weights  to  the 
observations,  if  the  curve  of  mortality  were  a  parabolic 
curve,  or  if  it  were  known  to  follow  accurately  Makeham's 
Law,  it  is  not  quite  clear  that  it  would  do  so  in  practice. 
It  may  be  assumed  that  (when  the  constants  are  formed 
by  reproducing  the  moments  of  the  deaths)  in  not 
weighting the observations, we give less weight to those at
the commencement and at the end of the table than they are
theoretically entitled to. But this is not a serious practical
objection. Makeham's law is only approximately correct,
and  as  we  reach  younger  adult  ages  it  begins  to  diverge  from 
the  facts  of  observation ;  on  the  other  hand,  as  we  reach  the 
older  ages  the  actual  importance  of  the  observations  is  less 
than  the  weight  to  which  they  are  theoretically  entitled,  as 
estimated  by  the  number  of  deaths,  owing  to  the  fact  that 
the  actual  mortality  at  those  ages  does  not  materially  affect 
financial  questions  such  as  rates  of  premium  and  reserves. 

Beyond  this  consideration  there  is  also  a  degree  of  doubt 
attaching  to  the  rates  of  mortality  at  extreme  ages  in  any 
table.† Indeed, we may go further, and say that in all
considerable tables of statistics the numbers at the extremes
of the table are proportionately more affected by sporadic or
accidental errors of observation than those in the body of the
table. If we suppose that in a very small percentage of
cases the ages of the "Exposed to Risk" and "Died" are
affected by errors of calculation, clerical errors in transcribing
the data, &c. (these cases being removed from their
true position and scattered at random over the table), the

* See Note G, p. 131.

† See my notes on this subject in "Principles and Methods", p. 148.

effect upon the data over the great bulk of the table will be
insignificant  owing  to  the  large  numbers  under  observation 
and  to  a  balance  of  errors,  but  the  effect  upon  the  experience 
at  the  extremes  of  the  table,  where  the  actual  numbers  under 
observation  are  very  small,  may  well  be  appreciable. 
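
A rough numerical illustration of this point is given below. The counts and the error rate are invented for the purpose, and the sketch assumes that the misplaced cases are drawn proportionately from every age and scattered uniformly over the ages; the function name is hypothetical.

```python
def stray_share(counts, error_rate):
    """Share of the observed number at each age that consists of misplaced
    cases, if a fraction error_rate of all cases is scattered uniformly."""
    total = sum(counts)
    stray_per_age = error_rate * total / len(counts)
    return [stray_per_age / ((1.0 - error_rate) * n + stray_per_age)
            for n in counts]

# Invented numbers under observation: small at the extremes, large in the body.
COUNTS = [50, 5_000, 40_000, 60_000, 30_000, 4_000, 40]
SHARES = stray_share(COUNTS, 0.001)  # one case in a thousand misplaced

for n, share in zip(COUNTS, SHARES):
    print(f"{n:>7,d} under observation: {100 * share:6.2f} per cent stray")
```

On these figures the stray cases are a negligible part of the body of the table but a considerable part of the extreme groups.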

Reverting  to  the  problem  of  obtaining  the  value  of  c  in 
Makeham's  formula  directly  from  the  observations,  we  may 
endeavour  to  represent  the  curve  of  the  "  Exposed  to  Risk  " 
by  some  frequency  curve  which  can  be  suitably  combined 
with the formula for $\mu_x$ to represent the deaths; such, for
example,  as  the  normal  curve  y  =  ke-'=''''\  or  the  curve 
No.  5,  y  =  'kx''ey'',  or  by  the  terms  of  a  binomial  expansion 
(see Calderon, J.I.A., vol. xxxv, p. 157). Unfortunately none
of  these  curves  give  a  very  satisfactory  representation 
of  the  average  form  of  the  "Exposed  to  Risk"  curve. 
In  the  case  of  the  binomial,  in  order  to  get  a  tolerable 
fit,   it  will  be  generally  found  that  the  value  of  n  in  the 

expression    — — (representing  the    general  term   of   the 

binomial)  must  be  taken  small ;  that  is  to  say,  the  data  must 
be  arranged  in  somewhat  large  groups  of  not  less  than  about 
10 ages to a group. In either case it will be necessary, after
obtaining  a  frequency  curve  fitting  the  numbers  of  the 
"Exposed  to  Risk,"  to  re-compute  the  deaths  on  the  basis 
of  these  graduated  numbers. 

Thus, while it is possible to determine the values of $c$
directly from the observations, the process is laborious. In
my opinion, it is preferable to use certain trial values of $c$
which we know to lie near the truth, and, by a comparison of
the resulting graduated deaths with the original facts, to
select a value which appears to give the best general
agreement, which may not always be that making the third
summation of the deviations zero.*

There  is  a  further  point  to  be  considered  with  respect  to 
the  nature  of  the  differences  between  the  original  numbers, 
whether  of  deaths  or  of  other  observations,  and  the  numbers 
obtained  by  a  graduation  following  a  formula  such  as  that  of 
Makeham.  These  divergences  between  the  ungraduated 
and  graduated  numbers  will  in  part  arise  from  the  smallness 
of  the  numbers  under  observation,  and  may  in  part  arise 
from the fact that the formula does not accurately represent

* See Note G, p. 133.

the true curve of mortality. For the majority of mortality
tables,  for  male  lives  at  the  adult  ages,  Makeham's  formula 
is  so  near  the  truth  that  we  may  in  practice  neglect  the 
systematic  errors  and  assume  that  the  formula  represents 
the  true  curve  of  mortality,  determining  our  constants  as 
though  the  whole  of  the  deviations  in  the  graduated  and 
ungraduated  curves  are  accidental  and  due  to  the  smallness 
of  the  data,  but  for  some  tables,  notably  those  representing 
the  mortality  of  females,  this  will  not  be  the  case. 

Other  expressions  may  be  given  representing  approximately 
the curve of $\mu_x$, as, for instance,

$\mu_x = ma^x + nb^x$   . . . (1)

whence

$\log_{10} l_x = K + Ma^x + Nb^x$   . . . (2)

an expression which enables us to represent some mortality
tables, such as those arising from tropical experience, that
are not very readily represented by Makeham's formula.
The values of these constants can be readily obtained either
from 5 selected values of $\log l_x$ or from the sums of the values
of selected groups of the same function.

The above formula for $l_x$ preserves in a modified form the
principle of uniform seniority; not, however, in a very
practicable shape, as in order to compute values of joint-lives
(any number) we require tables of $k$ joint-lives of equal age
for various values of $k$. It is of course evident from general
considerations that the force of mortality on any number of
joint-lives must consist of two terms, each of which is
a member of a geometrical progression, and that if we can
find an age $w$ where the relative values of these two terms are
the same as in the joint-life status, the actual values will
be the same when multiplied by some suitable constant $k$.
The required joint-life annuity will then be represented by
the annuity on $k$ joint-lives all of age $w$.

Take as an example an annuity on the joint-lives of $(x)$
and $(y)$. Find $k$ and $w$ so that

$a^x + a^y = ka^w$

$b^x + b^y = kb^w$

whence $w = \dfrac{\log(a^x + a^y) - \log(b^x + b^y)}{\log a - \log b}$

and $k = (a^x + a^y) \div a^w = (b^x + b^y) \div b^w$.

Then it is obvious that if we replace $x$ and $y$ by $x+t$ and $y+t$,
$k$ will remain unaltered and $w$ will become $w+t$, so that the
principle of uniform seniority is maintained. Thus, an
annuity on $x$ and $y$ will be equal to an annuity on $k$ lives all
aged $w$; or, since $k$ will not generally be integral, it will be
more convenient to say that $a_{xy} = a'_w$, where $a'$ is calculated at
forces of mortality always $k$ times the normal force, age for
age. Thus, we shall require tables for various standard
values of $k$, and we shall usually require a double interpolation,
since neither $w$ nor $k$ will usually be integral.
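
The determination of $w$ and $k$ may be sketched as follows. The ratios $a$ and $b$ are those of the numerical illustration given in this Lecture; the pair of ages and the function name are assumptions made only for the example.

```python
import math

def equivalent_status(a, b, ages):
    """Return (w, k) such that sum(a**x) = k * a**w and sum(b**x) = k * b**w,
    the sums being taken over the ages of the joint lives."""
    sum_a = sum(a**x for x in ages)
    sum_b = sum(b**x for x in ages)
    w = (math.log(sum_a) - math.log(sum_b)) / (math.log(a) - math.log(b))
    k = sum_a / a**w
    return w, k

A_RATIO, B_RATIO = 1.1007, 1.0251   # constants a and b of the illustration
W, K = equivalent_status(A_RATIO, B_RATIO, [40, 50])
# Five years later both lives are five years older.
W5, K5 = equivalent_status(A_RATIO, B_RATIO, [45, 55])

print(f"ages 40 and 50: w = {W:.3f}, k = {K:.4f}")
print(f"ages 45 and 55: w = {W5:.3f}, k = {K5:.4f}")
```

The second line of output shows $w$ advanced by exactly five years with $k$ unchanged, which is the modified principle of uniform seniority referred to above.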

The  principle  of  employing  the  sum  of  two  (or  more) 
geometrical  series  to  represent  the  logarithm  of  a  function 
such as the number living may also be used with advantage,
as  will  be  seen  later  on,  for  census  tables.  (See  the  Sixth 
Lecture.) 

As  an  example  of  this  formula,  we  may  apply  it  to  the 
column of $\log l_x$ in the $O^{M}$ Table.

Taking the values of $\log l_x$ for ages 20, 37, 54, 71 and 88,
we have the following data:

$\log l_{20} = 4.98432 = K + Ma^{20} + Nb^{20}$

$\log l_{37} = 4.94279 = K + Ma^{37} + Nb^{37}$

$\log l_{54} = 4.85300 = K + Ma^{54} + Nb^{54}$

$\log l_{71} = 4.58086 = K + Ma^{71} + Nb^{71}$

$\log l_{88} = 3.47509 = K + Ma^{88} + Nb^{88}$

whence, differencing, and writing

$Ma^{20}(a^{17} - 1) = -M'$;  $Nb^{20}(b^{17} - 1) = -N'$;  $a^{17} = \alpha$;  $b^{17} = \beta$;

we have

$M' + N' = \log l_{20} - \log l_{37} = .04153 = A$

$M'\alpha + N'\beta = \log l_{37} - \log l_{54} = .08979 = B$

$M'\alpha^2 + N'\beta^2 = \log l_{54} - \log l_{71} = .27214 = C$

$M'\alpha^3 + N'\beta^3 = \log l_{71} - \log l_{88} = 1.10577 = D$

whence, noting that

$\alpha\beta = \dfrac{BD - C^2}{AC - B^2}$  and  $\alpha + \beta = \dfrac{AD - BC}{AC - B^2}$,

we easily obtain:

$\alpha = 5.1082$ ;  $a = 1.1007$

$\beta = 1.5243$ ;  $b = 1.0251$

$M' = .0073886$ ;  $M = -.00026403$

$N' = .0341414$ ;  $N = -.039657$
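
The arithmetic of this example can be retraced as below. The sketch assumes only the five tabular values quoted above; $\alpha$ and $\beta$ are taken as the roots of the quadratic whose sum and product are given by the two ratios just noted, and the function names are hypothetical. Working to full machine accuracy, the constants should agree with the printed ones to about four significant figures.

```python
import math

# log10 of l_x at the five equidistant ages used in the text.
LOG_L = {20: 4.98432, 37: 4.94279, 54: 4.85300, 71: 4.58086, 88: 3.47509}

def fit_two_geometric(log_l, first_age, step):
    """Constants of log10 l_x = K + M*a**x + N*b**x from five equidistant values."""
    ages = [first_age + i * step for i in range(5)]
    A, B, C, D = (log_l[ages[i]] - log_l[ages[i + 1]] for i in range(4))
    den = A * C - B * B
    root_sum = (A * D - B * C) / den        # alpha + beta
    root_product = (B * D - C * C) / den    # alpha * beta
    disc = math.sqrt(root_sum**2 - 4.0 * root_product)
    alpha, beta = (root_sum + disc) / 2.0, (root_sum - disc) / 2.0
    m_dash = (B - A * beta) / (alpha - beta)    # M'
    n_dash = A - m_dash                         # N'
    a, b = alpha ** (1.0 / step), beta ** (1.0 / step)
    M = -m_dash / (a**first_age * (alpha - 1.0))
    N = -n_dash / (b**first_age * (beta - 1.0))
    K = log_l[first_age] - M * a**first_age - N * b**first_age
    return {"alpha": alpha, "beta": beta, "a": a, "b": b,
            "M_dash": m_dash, "N_dash": n_dash, "M": M, "N": N, "K": K}

def log_l_formula(constants, x):
    """Value of log10 l_x by formula (2)."""
    return (constants["K"] + constants["M"] * constants["a"]**x
            + constants["N"] * constants["b"]**x)

CONSTANTS = fit_two_geometric(LOG_L, first_age=20, step=17)
for name, value in CONSTANTS.items():
    print(f"{name:>7s} = {value:.7f}")
print("l_25 by formula:", round(10 ** log_l_formula(CONSTANTS, 25)))
```

Since there are five constants and five data, the formula reproduces the five chosen values of $\log l_x$ exactly; the test of the formula lies in the intermediate ages.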

The following comparison of the values of $l_x$ and
decrements for quinquennial ages will indicate the approximation
of the formula to the $O^{M}$ Table.

Table XI.

Values of $l_x$ and of $(l_x - l_{x+5})$ according to the $O^{M}$ Table, as
compared with re-graduation by formula (2).

                  l_x                Quinquennial Decrements       Errors
  Age     By         Original        By         Original
          Formula    Value           Formula    Value              +      -

   20     96,453     96,453           2,129      2,066            63
   25     94,324     94,387           2,467      2,445            22
   30     91,857     91,942           2,896      2,947                    51
   35     88,961     88,995           3,443      3,528                    85
   40     85,518     85,467           4,158      4,205                    47
   45     81,360     81,262           5,108      5,077            31
   50     76,252     76,185           6,350      6,266            84
   55     69,902     69,919           7,927      7,846            81
   60     61,975     62,073           9,775      9,766             9
   65     52,200     52,307          11,606     11,692                    86
   70     40,594     40,615          12,753     12,863                   110
   75     27,841     27,752          12,192     12,222                    30
   80     15,649     15,530           9,244      9,171            73
   85      6,405      6,359           4,827      4,763            64
   90      1,578      1,596           1,406      1,410                     4
   95        172        186             167        179                    12
  100          5          7               5          7                     2

FIFTH     LECTURE. 


Although in the preceding Lecture the application of
Makeham's  formula  has  been  considered  at  some  length,  its 
importance  is  such  that  we  may  now  touch  on  some  further 
points,  and  particularly  on  the  application  of  the  formula  to 
the  graduation  of  select  tables. 

The  suitability  of  Makeham's  formula  to  the  graduation 
of  mortality  tables  must  be  judged  as  we  should  judge  the 
applicability  of  any  other  frequency  curve  to  a  given  series  of 
observations.  That  is  to  say,  we  must  consider  whether  the 
observed  differences  between  the  graduated  and  ungraduated 
values  (the  computed  and  actual  deaths)  fall  within  what 
may  be  properly  considered  to  be  the  limits  of  error. 
For  practical  purposes,  owing  to  the  great  convenience 
attaching to the use of the formula, it is worth while to
stretch a point in its favour. Instead, therefore, of merely
considering the closeness of the agreement between the
actual and computed deaths, we may consider how nearly the
ungraduated and graduated monetary functions, such as the
values of premiums or annuities, are in agreement. If this
agreement is sufficient for our purpose, we are justified
in adopting the graduation as given by the formula,
notwithstanding  the  fact  that  at  certain  groups  of  ages  the 
divergences  between  the  graduated  and  ungraduated  deaths 
may  be  greater  than  would  be  expected  from  the  theory  of 
probabilities.  In  this  connection  it  is  to  be  noted  that  our 
observations  relate  to  past  time,  and  that  the  quantities  we 
are  measuring  are  all  liable  to  change  with  time.  Hence  in  a 
graduation  intended  to  form  the  basis  of  tables  of  annuities  or 
premiums  it  is  sufficient  if  the  general  character  of  the 
experience  is  retained  without  insisting  too  strongly  upon  a 
strict  adherence  to  minor  features.  This  is  illustrated  by  the 
following table from "Principles and Methods" (p. 162), in
which we may anticipate for the moment the question of the
application of Makeham's formula to select tables:

$O^{[M]}$ Whole-Life Participating: Males.
3 per-cent Premiums for £100 Assured.

                    P_[x]                  G - U         Sprague's      H^[M] - O^[M]
   Age     Ungraduated   Graduated       +       -       H^[M] Select   = (6) - (3)
                                                                             +
   (1)         (2)          (3)         (4)     (5)          (6)            (7)

    20        1.379        1.365                .014        1.563           .198
    25        1.535        1.551       .016                 1.703           .152
    30        1.779        1.785       .006                 1.925           .140
    35        2.086        2.081                .005        2.218           .137
    40        2.453        2.457       .004                 2.602           .145
    45        2.952        2.940                .012        3.106           .166
    50        3.571        3.564                .007        3.755           .191
    55        4.338        4.377       .039                 4.635           .258
    60        5.413        5.446       .033                 5.827           .381
    65        6.872        6.854                .018        7.433           .579

 Average      3.238        3.242       .004                 3.477           .235

Here columns (4) and (5) show how far the graduated select
annual premium $P_{[x]}$ for each age at entry differs from the
ungraduated value for the same age, while column (7) shows
how far the annual premiums deduced by Dr. Sprague from
the $H^{M}$ data (Journal of the Institute of Actuaries, vol. xxii,
p. 391) differ from the premiums deduced from the $O^{[M]}$
Experience. The average difference between the graduated
and ungraduated premiums (irrespective of sign) amounts to
.015 per £100 assured, a quite insignificant amount; whereas
the differences between the premiums representing the earlier
experience and those of the $O^{[M]}$ Table, representing the
experience of 30 years later, are all positive and average .235
per £100 assured.
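
The two averages quoted can be checked directly from the columns of the table. The following is merely that arithmetic set out step by step; the variable names are chosen for the illustration.

```python
# Premiums per 100 assured at entry ages 20, 25, ..., 65, from the table above.
UNGRADUATED = [1.379, 1.535, 1.779, 2.086, 2.453, 2.952, 3.571, 4.338, 5.413, 6.872]
GRADUATED = [1.365, 1.551, 1.785, 2.081, 2.457, 2.940, 3.564, 4.377, 5.446, 6.854]
SPRAGUE = [1.563, 1.703, 1.925, 2.218, 2.602, 3.106, 3.755, 4.635, 5.827, 7.433]

def mean(values):
    return sum(values) / len(values)

# Average graduated-minus-ungraduated difference, irrespective of sign.
MEAN_ABS_G_MINUS_U = mean([abs(g - u) for g, u in zip(GRADUATED, UNGRADUATED)])
# Average excess of the earlier (Sprague) premiums over the graduated ones.
MEAN_H_MINUS_O = mean([h - g for h, g in zip(SPRAGUE, GRADUATED)])

print(f"average |G - U|   = {MEAN_ABS_G_MINUS_U:.3f}")
print(f"average (6) - (3) = {MEAN_H_MINUS_O:.3f}")
print(f"ratio             = {100 * MEAN_ABS_G_MINUS_U / MEAN_H_MINUS_O:.1f} per cent")
```

The disturbance introduced by the graduation is thus a small fraction of the change between the two experiences.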

Only  a  part  of  the  differences  shown  in  columns  (4)  and  (5) 
are  due  to  any  systematic  difference  between  the  mortality 
as shown in the $O^{[M]}$ data and that assumed by the formula.
Assuming, however, that the entire differences were due to
this cause, it will be seen that the changes introduced into
the values of the monetary functions by using Makeham's
formula are a very small percentage of the actual change
that has occurred in the value of these functions during the
course of 30 years.

Although, therefore, the differences between the graduated
and  ungraduated  deaths  do  at  certain  points  somewhat 
exceed  the  limits  of  the  errors  of  observation,  we  are  justified 
in  using  the  graduated  table  as  a  standard  for  the  future. 

Each  case  must,  of  course,  be  decided  upon  its  own  merits, 
and while the $H^{M}$ Experience and the $O^{M}$ Experience have,
with  other  tables,  proved  to  be  amenable  to  Makeham's 
formula,  the  latter  cannot  be  treated  as  a  "  law  of  mortality  ", 
to  which  all  tables  may  be  expected  to  conform.  As  already 
stated,  its  suitability  must  be  tested,  as  that  of  any  other 
frequency  curve,  but  with  rather  more  latitude  owing  to  its 
practical  advantages.  In  particular  the  formula  is  not 
generally  suitable  for  tables  representing  the  mortality  of 
Female Lives.


In  the  last  lecture  we  considered  various  methods  of 
determining the constants of Makeham's formula for $\mu_x$, best
representing  a  given  mortality  experience,  in  particular 
that  depending  upon  the  agreement  between  the  totals  of  the 
graduated  and  ungraduated  deaths  and  of  their  successive 
summations.  We  have  so  far,  however,  considered  the  force 
of  mortality  as  a  function  of  the  age  only,  so  that  our  results 
are  applicable  only  to  "mixed"  tables  of  mortality,  not  to 
"select"  tables  in  which  the  mortality  is  treated  as  a  function 
both of the age of the life and of the duration of the
assurance. 

The  formula  owes  its  value,  beyond  the  incidental 
advantage  that  it  gives  us  a  very  simple  and  effective 
method  of  graduation,  to  the  relation  it  establishes  between 
the  value  of  an  annuity  upon  joint  lives  of  any  age  and  that 
of  an  annuity  upon  the  same  number  of  joint  lives  of  equal 
age.  From  the  formula  for  the  force  of  mortality  according 
to  Makeham's  hypothesis 

$\mu_x = A + Bc^x$

it follows that the force of mortality for any number of joint
lives, aged, for example, at entry $x$, $y$, $z$, is given by the
formula

$\mu_{x+t:\,y+t:\,z+t} = 3A + B(c^x + c^y + c^z)c^t = 3(A + Bc^{w+t})$

where $c^w = \tfrac13(c^x + c^y + c^z)$,
and where $t$ represents the period elapsed since the date of entry.
As a value of $w$ satisfying this equation can always be found,
and is independent of $t$, it follows that

$a_{xyz} = a_{www}$.
It  is  seen  that  the  relation  subsisting  between  the  value 
of  w  and  the  values  of  x,  y,  z,  involves  the  constant  c  only, 
and  not  the  constants  A  and  B ;  hence,  any  variation 
introduced  into  the  values  of  the  constants  A  and  B,  having 
reference  to  the  time  elapsed  since  selection  and  depending 
only  on  t,  will  not  affect  the  relation  between  the  age  w 
and  the  ages  x,  y,  and  z.  We  can,  therefore,  write  the 
force  of  mortality  at  age  x  +  t  for  a  life  select  at  age  x  as 
follows  : 

$\mu_{[x]+t} = A + f(t) + [B + \phi(t)]c^{x+t}$   . . . (1)

and still retain the relation

$a_{[x][y][z]} = a_{[w][w][w]}$

when $c^w = \tfrac13(c^x + c^y + c^z)$.
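
That the equivalent equal age depends on $c$ alone, and holds at every duration, may be verified numerically. In the sketch below the values of $A$, $B$ and $c$ and the three ages are assumed for illustration only, and the function names are hypothetical.

```python
import math

def equivalent_equal_age(c, ages):
    """Age w for which c**w is the arithmetic mean of c**x over the lives."""
    mean_power = sum(c**x for x in ages) / len(ages)
    return math.log(mean_power) / math.log(c)

def joint_force(A, B, c, ages, t):
    """Force of mortality of the joint status t years after entry (Makeham)."""
    return sum(A + B * c ** (x + t) for x in ages)

A, B, C = 0.006, 0.00005, 10 ** 0.04   # assumed constants, log10 c = .04
ENTRY_AGES = [30, 40, 55]
W = equivalent_equal_age(C, ENTRY_AGES)

for t in (0, 5, 20):
    actual = joint_force(A, B, C, ENTRY_AGES, t)
    equal_age = len(ENTRY_AGES) * (A + B * C ** (W + t))
    print(f"t = {t:2d}: joint force {actual:.6f}, equal-age force {equal_age:.6f}")
```

Since $A$ and $B$ enter both sides alike, replacing them by $A + f(t)$ and $B + \phi(t)$ leaves the agreement undisturbed, which is the property relied upon in equation (1).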

Equation  (1)  may  obviously  be  written  in  the  form 

$\mu_{[x]+t} = A_t + B_t c^{x+t}$   . . . (2)

or  alternatively,  if,  as  is  often  more  convenient,  we  work  with 
the values of $\operatorname{colog} p_{[x]+t}$, in the form

$\operatorname{colog}_{10} p_{[x]+t} = \alpha_t + \beta_t c^{x+t}$   . . . (3)

where $A_t$ and $B_t$, or $\alpha_t$ and $\beta_t$, may be any functions of $t$, but
are not functions of $x$. We can thus represent the rate of
mortality as a function both of the age and of the time elapsed
since selection and so approximate fairly to the rates of
mortality shown in an "analyzed" or "select" mortality
experience, while retaining most of the advantages arising
from the use of Makeham's formula. The two functions of $t$
have probably a tendency to become constant as $t$ increases
but do not necessarily become so within any special period
from the date of entry; they may continue to change slowly
throughout the whole duration of the table, and in theory, no
doubt, should do so, but for practical purposes it is convenient
to make them constant after a few years (say 5, or at most 10)
from the date of entry, beyond which point it is assumed that
the effect of "selection" has worn off.

If we set out separately the data for each year of assurance,
that is, for each value of $t$ so far as we intend to trace selection,
we shall have a series of equations (corresponding to those
shown on p. 65 for an aggregate table) for determining the
numerical values of the functions $f(0)$, $f(1)$, &c., $\phi(0)$, $\phi(1)$,
&c., the value of $c$ being necessarily that determined for
the "ultimate" table. In other words, the data for each
year of duration are treated as representing a mortality
table complete in itself. We obtain in this way values for $A_t$
and $B_t$, or for $\alpha_t$ and $\beta_t$, for each value of $t$, so far as it is
proposed to carry the select tables. Unless, however, the
experience is a very large one, these values will be very
irregular. Indeed, in the case of the $O^{[M]}$ data, which represent
a large experience, we have somewhat irregular values
for $\alpha_t$ and $\beta_t$, even during the first ten years of assurance,
where the facts are most numerous. The approximate values
of $\alpha_t$ and $\beta_t$ for the $O^{[M]}$ data are given on p. 157 of "Principles
and Methods." If these values are plotted out, the resulting
curves exhibit certain obvious characteristics, as will be
seen by the diagrams opposite, where the irregular lines show
the ungraduated, and the curved lines the graduated, values of $\alpha_t$
and $\beta_t$, and the horizontal lines after 10 years represent the
values for the experience of 10 years' duration and upwards,
when they are assumed to be constant. A period of 10 years
would appear from the data to be the shortest within which
we can effect anything like a smooth junction between the
"select" and "ultimate" mortality rates.

The values of $\alpha_t$ rise very rapidly in the first few years
of assurance, but after about 6 or 7 years they appear to
approach nearly their final value. In the case of $\beta_t$, however,
we see that if the graduated curve were drawn as closely as
is consistent with smoothness through the ungraduated values,
it would probably not reach the level of the ultimate value
.0000406 until after 15 years from entry, and even then it
would be below the value of $\beta_t$ for durations of 15 years and
over. Hence it would seem that the value of $\beta_t$ does not
become constant until about 20 years have elapsed from the
date of entry. We may almost say that while the effect of
selection as reflected in the values of the $\alpha$ constant disappears
after about 7 years, the effect upon the values of $\beta$ probably
continues throughout the whole of life. The explanation is,
no doubt, that the $\alpha$ or $A$ constant represents mortality from
accidental causes and from non-constitutional diseases of short
duration, whereas the $\beta$ or $B$ constant represents mortality
due to diseases of longer duration and to constitutional
defects.

Having obtained numerical values of $\alpha_t$ and $\beta_t$ for
successive values of $t$, it remains to represent these values
by convenient formulæ. The fact that the function $\beta_t$ does
not reach its ultimate value at the end of 10 years from
entry involves either some sacrifice of the agreement between
the adjusted and unadjusted values of this function, or a
continuation of the analyzed mortality rates beyond the period
of 10 years, which is not very convenient. In consequence of
this fact we cannot apply the method of moments in fitting
a graduated curve to these values. Where the fitting of a
frequency curve involves any systematic departure from the
original facts, the method of moments often gives
unsatisfactory results, and a curve may be produced
departing more widely from the observations than if derived
by a tentative method.

In selecting formulæ for graduating the rough values
of $\alpha_t$ and $\beta_t$, there are certain conditions which should be
fulfilled:

1.  A  smooth  junction  between  the  curves  representing  the 
select  and  ultimate  tables. 

2.  An  agreement  between  the  graduated  and  ungraduated 
values of $\alpha_t$, $\beta_t$ in year 0, as a special importance attaches to
the  rate  of  mortality  in  the  first  year  of  assurance. 

3.  An  agreement  between  the  aggregate  graduated  and 
ungraduated  values  of  these  functions  during  the  period 
between  the  date  of  entry  and  the  ultimate  table. 

To conform to these conditions as far as possible, we
must select a curve for the values of $\beta_t$ which, whilst
running smoothly into the constant value at the end of ten
years, will represent fairly well the distinctly lower values of
$\beta_t$ in the years immediately preceding. This may be done by
representing the difference between $\log l_{x+t}$ (the value of this
function in the "ultimate" table) and $\log l_{[x]+t}$ (the value in
the "select" table), so far as this difference is due to changes
in $\beta_t$, by an expression of the form $n(10-t)^2\beta c^x$, where $\beta$ is
the ultimate value; whence we have the corresponding
difference:

$= 2n(10-t)\,\beta c^{x}$

$= 2n(10-t)c^{-t}\,\beta c^{x+t}$

so that $\beta_t = [1 - 2n(10-t)c^{-t}]\beta$.

The result of this is to eliminate from the $\beta$ constant at
the later durations part of the effect of selection, and
somewhat to exaggerate the effect in earlier years.
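
The course of $\beta_t$ under the expression $\beta_t = [1 - 2n(10-t)c^{-t}]\beta$ may be tabulated as below. The ultimate value .0000406 is that quoted earlier in this Lecture; the values taken for $n$ and for $\log_{10} c$ are assumptions made solely for the illustration and are not those of any published graduation.

```python
def beta_select(t, beta_ultimate, n, c, period=10):
    """beta_t = [1 - 2n(period - t) c**(-t)] * beta for t < period, and the
    ultimate value thereafter."""
    if t >= period:
        return beta_ultimate
    return (1.0 - 2.0 * n * (period - t) * c ** (-t)) * beta_ultimate

BETA = 0.0000406      # ultimate value quoted in the text
N = 0.01              # assumed for illustration only
C = 10 ** 0.04        # assumed, taking log10 c = .04

BETA_T = [beta_select(t, BETA, N, C) for t in range(0, 13)]
for t, value in enumerate(BETA_T):
    print(f"t = {t:2d}   beta_t = {value:.8f}")
```

The factor $(10-t)$ brings $\beta_t$ to its ultimate value at $t = 10$, and the squared factor in the underlying expression secures that the junction there is a smooth one.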

We have now to decide as to the curve best representing
the values of $\alpha_t$. The method employed will depend very
much on the character of the experience we are treating. In
the $O^{[M]}$ Experience it was again found convenient to adopt
an expression for the difference of $\log_{10} l_{x+t}$ and $\log_{10} l_{[x]+t}$, so
far as this difference was due to change in $\alpha_t$, containing a
term similar to that due to $\beta_t$ with the addition of a further
term representing a geometrical series rapidly diminishing as
$t$ increased. The final form of the equation for the $O^{[M]}$
Experience was as under:

Having  determined  the  form  of  this  equation,  the  simplest 
method  for  determining  the  constants  is  to  express  in  terms 
of  them  the  difference  between  the  computed  deaths  by  the 
ultimate  table  of  mortality,  and  the  actual  deaths  for  each  age 
or  each  group  of  ages  and  each  year  of  assurance. 

We have in that way a series of equations for determining
the values of these constants $m$, $m'$, $c$, $n$, and hence of
$A_t$ and $B_t$ for each value of $t$, similar in principle to the
equations used for determining the values of the original
constants $A$ and $B$. The only point that arises is as to what
particular way we are to group the observations to determine
those values.

The value of $m$ in the above formula having been
ascertained with a view to representing as nearly as
practicable the effect of selection upon the constant $B$, there
remain in all four unknown quantities in the formula
to be determined, and the actual equations used to
determine them were formed by taking the first and second
summations by ages of the whole of these expressions,
representing the difference between the "select" and
"ultimate" rates, first for year of assurance 0 alone, and then
for the whole of the ten years.

The  selection  of  these  particular  groups  is,  of  course,  not 
a question of principle, but of convenience. Each case must
be  treated  with  reference  to  the  nature  of  the  curve  of 
selection,  as  brought  out  by  the  statistics,  and  such  a  process 
adopted  as  appears  to  be  calculated  to  bring  out  the  best 
results  in  the  particular  case  in  question. 

It  may  happen  in  certain  tables  that  it  is  inconvenient  to 
trace  out  the  effect  of  selection  for  so  many  years,  and  in 
particular  this  is  the  case  in  a  table  representing  the  mortality 
of  annuitants.  In  such  a  table  the  effect  of  selection  (which 
is  here  the  self-selection  of  the  annuitant)  persists  for  a  very 
long  time.  In  a  table  of  insured  lives,  owing  to  the  cessation 
of  new  entrants  in  middle  life,  practically  at  about  age  55, 
the  mortality  at  the  older  ages  is  but  slightly  affected  by 
selection.  In  the  case  of  annuitants,  where  there  is  a 
constant  inflow  of  fresh  lives  up  to  75  or  80  years  of  age, 
the  mortality  is  affected  by  this  cause  throughout  the  Avhole 
extent  of  the  table.  To  completely  represent  the  effect  of 
selection  in  such  an  experience  will  require  an  elaborate  series 
of  tables,  showing  for  each  entry  age  the  value  of  annuities 
for  each  year  elapsed  since  entry  for  many  years  duration. 
The  tables  given  in  "'  Principles  and  Methods",  pp.  124,  125, 
show  that  as  regards  the  0"'"  and  0"-^  Experience,  and  doubtless 
the  same  feature  would  be  found  to  be  general,  the  values  of 
the  expectation  of  life  ten  years  after  entry  are  appreciably 
greater  than  the  values  for  the  same  ages  derived  from  the 
"ultimate"  rate  of  mortality  (e[j.]+io>e,i.+io).  Consequently,  if 
the  graduated  rates  of  mortality  for  the  first  five  or  ten  years 
froin  entry  are  employed  in  conjunction  with  i*ates  representing 
the  aggregate  mortality  after  five  or  ten  years,  as  the  case 
may  be,  the  ultimate  values  of  the  annuities,  and  also  the 
values  of  the  date  of  entry  will  on  the  whole  be  under- 
estimated. In  any  table  used  for  the  grant  of  annuities 
it  is,  however,  most  important  that  annuities  at  the  date  of 
entry  shall  not  be  undervalued,  and  of  only  less  importance 
that  the  values  in  succeeding  years  shall  be  such  as  may 
be   safely   employed   in   estimating   reserves.      Any  method. 
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therefore,  of  treating  an  annuity  experience  Avhicli  tends  to 
underestimate  the  values  of  annuities  is  clearly  unsuitable. 
Full  weight  must  accordingly  be  given  to  the  effect  of 
selection,  but  to  avoid  the  heavy  Avork  involved  in  a  complete 
analysis,  the  expedient  may  be  adopted  of  computing  a 
hypothetical  table  of  mortality  Avhich  will  correspond  to  the 
values  of  the  annuities,  let  us  say,  five  years  from  the  date 
of  entry.  If  this  can  be  done  successfully  and  the  rates  of 
mortality  for  the  first  five  years  joined  on  smoothly  with  the 
rates  in  such  hypothetical  table,  we  shall  then  have  a  correct 
measure  of  the  value  of  annuities  at  entry  and  for  the  five 
years  following,  while  thereafter  the  values  will  be  slightly, 
but not seriously, overestimated, an error which will be on
the  right  side. 

We  may  take  as  our  basis  either  the  values  of  the 
"  expectation  of  life "  or  of  the  annuities  at  a  suitable  rate 
of  interest.  We  will  assume  the  former  to  be  adopted.  As 
these values ($e_{[x]+5}$) will depend upon separate groups of data,
viz., the entrants at individual ages, it will not be practicable
to construct an ungraduated table of $p_x$ from the formula

$$p_x = \frac{e_x}{1 + e_{x+1}},$$

the irregularities in the individual values of $e_x$
leading to anomalous results. A better plan will be to
graduate the table of expectations. For this purpose, we
may assume any frequency curve which will represent these
expectations satisfactorily, for example, a curve such as
$\log_{10} e_x = a + bx + cx^2 + dx^3 + fx^4$. We may employ values of $e_x$
deduced from the experience of individual ages at entry,
or  we  may  combine  the  entrants  in  quinary  groups  of  ages, 
taking  due  account  of  the  true  average  age  of  each  group 
of  entrants. 
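
The difficulty with ungraduated expectations can be illustrated numerically. The following is a minimal sketch in Python, assuming curtate expectations so that the relation $p_x = e_x/(1+e_{x+1})$ holds exactly; the function name and all figures are hypothetical and serve only to show how a small irregularity in one observed value produces an anomalous $p_x$.

```python
def survival_probs(e):
    """Return p_x = e_x / (1 + e_{x+1}) for a run of consecutive expectations."""
    return [e[i] / (1.0 + e[i + 1]) for i in range(len(e) - 1)]

# Hypothetical smooth expectations at four consecutive ages.
smooth = [12.00, 11.40, 10.82, 10.26]
# The same series with one value disturbed by +0.50 (an "observed" irregularity).
rough = [12.00, 11.90, 10.82, 10.26]

p_smooth = survival_probs(smooth)
p_rough = survival_probs(rough)

for label, p in (("smooth", p_smooth), ("rough", p_rough)):
    print(label, [round(v, 4) for v in p])
```

In the disturbed series the second probability exceeds unity, which is the kind of anomalous result the text has in view.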

The  only  point  of  importance  where  difficulty  arises  is  the 
weighting  of  the  different  equations.  These  are  not  of  equal 
weight  because  the  expectations  of  life,  as  deduced  from  the 
unadjusted  experience,  are  based  upon  a  smaller  or  larger 
experience,  as  they  fall  at  the  extremes  or  in  the  middle  of 
the  table,  and  some  method  must  be  devised  for  giving  due 
weight to this fact. This may be done by simply weighting
the equations with the actual number of entrants at that
particular age, and much may be said for this method
although it slightly underestimates the weights at the
extreme ages. If we are dealing, for example, with the
values of annuities, and approximately the same result will
be arrived at when working with the expectations of life, the
plan of weighting the unadjusted values in proportion to the
number of lives entering at each age, would make the total
cost of all the annuities by the graduated table the same as
by the ungraduated, an agreement that would have some
practical value. In the alternative, we may consider that
each value of the expectation of life (or of the annuity, as
the case may be) should be weighted in proportion to the
reciprocal of its average error. Thus if $e_{[x]+5} = A \pm z$, where
$A$ is the observed value and $z$ the average error, we shall have

$$\frac{e_{[x]+5}}{z} = \frac{A}{z} \pm 1.$$

It is difficult to determine satisfactorily the

average  error  in  the  value  of  the  unadjusted  expectation  of 
life,*  the  problem  being  complicated  by  the  incompleteness 
of  the  observations  due  to  the  "  existing."  A  fairly 
satisfactory  method  of  estimating  the  average  error  would 
be  as  follows.  Taking  the  series  consisting  of  the  values 
of $e_x$ for all values of $x$, each of those values depending on
a given age at entry only, we may assume that the observed
second differences of these quantities, $e_{x-1} - 2e_x + e_{x+1}$, which,
in a well graduated table, would be very small, are due to
the errors of observation in the values $e_{x-1}$, $e_x$, and $e_{x+1}$. In
any particular group of entry ages, we may say that the
average of the central second differences (taken irrespective
of sign) will be, on the average, proportional to the average
error in $e_x$ for that particular group.† Computing the average
values  of  the  central  second  differences  (without  sign),  for 
various  sections  of  the  table,  and  drawing  a  smooth  curve 
through  them,  we  should  obtain  values  from  which  suitable 
relative  weights  for  the  individual  observations  could  be 
deduced. 
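
The second-difference method can be sketched as follows. This is an illustration only: the function names, the section size and the expectations are hypothetical, the division by $\sqrt{6}$ assumes independent errors of equal size in the three terms of each second difference, and the smoothing of the section averages by a curve is left out.

```python
import math

def central_second_differences(e):
    """e_{x-1} - 2 e_x + e_{x+1} for every interior term of the series."""
    return [e[i - 1] - 2 * e[i] + e[i + 1] for i in range(1, len(e) - 1)]

def section_average_errors(e, section_size):
    """Average |second difference| per section, scaled down by sqrt(6)."""
    d2 = [abs(d) for d in central_second_differences(e)]
    errors = []
    for start in range(0, len(d2), section_size):
        chunk = d2[start:start + section_size]
        errors.append(sum(chunk) / len(chunk) / math.sqrt(6))
    return errors

# Hypothetical unadjusted expectations at eight consecutive entry ages.
e_obs = [10.0, 9.6, 9.1, 8.7, 8.2, 7.9, 7.3, 7.0]

errors = section_average_errors(e_obs, section_size=3)
# Relative weights in proportion to the reciprocal of the average error.
weights = [1.0 / err for err in errors]
print(errors, weights)
```

Only the ratios of the weights matter, so the constant factor $\sqrt{6}$ could equally be dropped.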

This would be a very fair method of determining practically
the weight to be attached to the values of $e_x$ in different parts
of the table. Or we may proceed, as was actually done in
the case of the annuity experience graduation, by assuming
the error in the value of $e_x$ to be a function, first, of the total
number of deaths in the experience representing the particular
entry age, and secondly, of the age $x$. This method may
appear somewhat arbitrary, but as only the relative weights
are in question, it is sufficient for the purpose. It must be
understood that the relative weights adopted do not very
greatly affect the results. The values of Makeham's constants
as deduced, for example, from the values of $\log l_x$ for ages
25, 45, 65, 85, thus giving equal weight to the observed
value of mortality from ages 25 to 85, would not generally
differ materially from the values resulting from a careful
system of weighting, although, of course, the latter are to
be preferred.

* See, however, the Sixth Lecture, pp. 100-104.
† The average value of $e_{x-1} - 2e_x + e_{x+1}$ will be $\sqrt{6}$ times the average error in $e_x$.

Assuming the "exposed to risk" to remain unchanged,
the average error in the observed number of deaths is
approximately $\pm .8\sqrt{nq(1-q)}$, where $n$ is the total of the
"exposed to risk" and $nq$ the total deaths. The average
percentage error in the total deaths will, therefore, be
proportionate to $\pm\sqrt{\frac{1-q}{nq}}$. If we suppose that this average
error is distributed uniformly through all ages passed through
by the particular group of entrants, we can then arrive at a
rough estimate of the average error in the observed value of
$e_x$, by computing the effect of a change of, say, 1 per-cent in
the mortality rates throughout.
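
As a worked illustration of these two expressions, the sketch below evaluates them for a hypothetical experience of 10,000 exposed to risk at a rate of mortality of .02. The function names and the figures are assumptions. The factor .8 is close to $\sqrt{2/\pi} \approx .798$, the ratio of the average (absolute) error to the standard deviation of a normal curve of error.

```python
import math

def average_error_in_deaths(n, q):
    """Approximate average error in the observed deaths: 0.8 * sqrt(n q (1 - q))."""
    return 0.8 * math.sqrt(n * q * (1.0 - q))

def relative_error_in_deaths(n, q):
    """Average error as a proportion of the deaths n q: 0.8 * sqrt((1 - q) / (n q))."""
    return average_error_in_deaths(n, q) / (n * q)

n, q = 10_000, 0.02          # hypothetical exposed to risk and rate of mortality
abs_err = average_error_in_deaths(n, q)
rel_err = relative_error_in_deaths(n, q)
print(abs_err, rel_err)
```

With these figures the 200 expected deaths carry an average error of about 11, or some 5.6 per-cent.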

The assumptions here are not strictly accurate, as errors
in the value of $e_x$ arise not only from the total number
of deaths being greater or less than the expected amount,
but from the manner in which the excess or defect of
mortality is distributed through the table. The neglect of
this second source of error will not, however, seriously affect
the relative weights arrived at, and for practical purposes the
relative average errors in the value of $e_x$ will be dependent,
first, on the average error in the total deaths observed in the
experience from which it is deduced, and second, on the
extent to which a given percentage error in the mortality
distributed uniformly through the table will affect the value
of $e_x$. The product of these two factors may be taken as
representing sufficiently approximately the expected error in
the value of $e_x$, remembering always that this estimated error
is not an absolute, but a relative measure at the various ages.
When this is done, we have, by taking the reciprocals of
those quantities, the weights which we shall give to the
observed values of $e_x$ in order to determine our constants.
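
The product of the two factors may be sketched as below. This is a rough illustration under stated assumptions: the rates of mortality are a hypothetical geometric series, the expectation is the curtate one, and the function names are invented for the purpose. Since only relative weights are wanted, constant factors are ignored.

```python
def curtate_expectation(q):
    """Sum of the probabilities of surviving 1, 2, 3, ... years."""
    e, surv = 0.0, 1.0
    for qx in q:
        surv *= 1.0 - qx
        e += surv
    return e

def sensitivity(q, pct=0.01):
    """Fall in the expectation when every rate is raised by pct (1 per-cent)."""
    bumped = [min(1.0, qx * (1.0 + pct)) for qx in q]
    return curtate_expectation(q) - curtate_expectation(bumped)

def relative_weight(q, rel_error_deaths):
    """Reciprocal of (relative error in total deaths) x (sensitivity of e_x)."""
    return 1.0 / (rel_error_deaths * sensitivity(q))

# Hypothetical rates of mortality for 40 years from a given age.
q = [0.02 * 1.09 ** k for k in range(40)]

w_large = relative_weight(q, rel_error_deaths=0.05)   # larger experience
w_small = relative_weight(q, rel_error_deaths=0.10)   # smaller experience
print(sensitivity(q), w_large, w_small)
```

Halving the relative error in the deaths doubles the weight, the sensitivity of the expectation being the same in both cases.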

It is necessary to point out that this process, while suitable
for expectations calculated from entrants at a particular age
or small groups of ages, will not apply to aggregate tables;
for in their case the percentage error in the total deaths
above age $x$ steadily increases as $x$ increases, so that this
method would produce weights steadily diminishing from the
youngest age to the oldest, which would obviously be
incorrect.


Notwithstanding the important effect of selection on
mortality, it is frequently ignored, as in the $H^{M}$ and $O^{M}$
Tables. It is important to consider, therefore, what is the
net effect in a mortality table of neglecting altogether the
factor of selection. Considerable additional labour attaches
to the use of select tables for valuation purposes, and the
question may be asked what kind of errors do we make if we
neglect the fact that mortality is a function not only of the
age, but also of the duration of assurance, and treat it simply
as a function of the age as it is treated in the $O^{M}$ and $H^{M}$
Tables. In a mortality table representing assured lives the
effect will be seen if we compare a table like the $H^{M}$ Table
with a table like Dr. Sprague's Select Table, or if we compare
a table such as the $O^{M}$ Table with a table like the $O^{[M]}$
Select Table:

Comparison of Annual Premiums for the Assurance of 100
(3 per-cent interest.)

  Age     $H^{M}$    $H^{[M]}$     $O^{M}$    $O^{[M]}$
                     (Sprague)

  20      1.427      1.563         1.306      1.365
  25      1.625      1.703         1.524      1.551
  30      1.880      1.925         1.790      1.785
  35      2.193      2.218         2.116      2.081
  40      2.589      2.602         2.524      2.457
  45      3.114      3.106         3.046      2.940
  50      3.801      3.755         3.730      3.564
  55      4.725      4.635         4.641      4.377
  60      5.987      5.827         5.872      5.444
  65      7.705      7.433         7.557      6.853

If we compare, as is most convenient, either annuity or
premium-values, we shall find that the effect of ignoring the
element of selection and treating the mortality rates as
a function of the age alone is that, at the younger entry
ages, premiums are underestimated and annuity-values are
overestimated.* The $O^{[M]}$ premiums should, properly speaking,
be compared with those derived from a table representing the
true aggregate of the select tables, but no such table is
available. There is a point, which is in general somewhat greater
than the average age at entry, at which the two curves
representing the premium values for the mixed and select
data cross each other, and for the older ages the premiums by
mixed tables are greater than those by the select table. The
extent of the differences in the premiums is sufficient to
render it necessary, in adopting a basis for assurance
premiums, to take into account the question of selection. The
only plan by which the use of select tables can safely be
avoided, is either by adopting a special form of loading
or by throwing out altogether from the data upon which the
premiums are based those years of assurance which are
seriously affected by selection, that is to say by employing
a table of the $H^{M(5)}$ or $O^{M(5)}$ type. We then obtain a table
which at all ages overestimates the values of the premiums
and underestimates the values of annuities.
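
The point at which the mixed and select premium curves cross may be located roughly from the comparison of premiums given above. The sketch below is an illustration only: it uses the tabulated premiums at quinquennial ages, interpolates linearly in the difference between the select and mixed premiums, and the function name is hypothetical.

```python
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]

# Annual premiums per 100 at 3 per-cent, as tabulated above.
H_M   = [1.427, 1.625, 1.880, 2.193, 2.589, 3.114, 3.801, 4.725, 5.987, 7.705]
H_sel = [1.563, 1.703, 1.925, 2.218, 2.602, 3.106, 3.755, 4.635, 5.827, 7.433]
O_M   = [1.306, 1.524, 1.790, 2.116, 2.524, 3.046, 3.730, 4.641, 5.872, 7.557]
O_sel = [1.365, 1.551, 1.785, 2.081, 2.457, 2.940, 3.564, 4.377, 5.444, 6.853]

def crossing_age(ages, mixed, select):
    """Age at which (select - mixed) passes from positive to negative."""
    diffs = [s - m for m, s in zip(mixed, select)]
    for i in range(len(diffs) - 1):
        if diffs[i] > 0 >= diffs[i + 1]:
            frac = diffs[i] / (diffs[i] - diffs[i + 1])
            return ages[i] + frac * (ages[i + 1] - ages[i])
    return None  # no crossing within the tabulated range

cross_H = crossing_age(ages, H_M, H_sel)
cross_O = crossing_age(ages, O_M, O_sel)
print(round(cross_H, 1), round(cross_O, 1))
```

On these figures the crossing falls between ages 40 and 45 for the $H$ tables and between 25 and 30 for the $O$ tables.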

A table representing "ultimate" rates of mortality, that
is, of the $H^{M(5)}$ or $O^{M(5)}$ type, is therefore a safe one to employ
for  the  grant  of  assurances,  although  not  for  the  grant  of 
annuities.  There  is,  indeed,  very  much  to  be  said  for  the  use 
of  a  table  of  that  kind  for  assurance  purposes,  but,  to 
discuss  that  question,  we  should  have  to  go  into  the  finance  of 
life  assurance  valuations,  which  hardly  comes  within  the 
scope  of  our  subject. 

With  a  view  of  avoiding  the  necessity  for  select  tables,  a 
device  was  adopted  by  the  American  offices  in  their  first 
experience denominated the "final series" method. The
object  was  to  produce  a  table  not  entirely  unaffected  by 
selection,  but  in  which  its  influence  would  be  reduced  to  a 
minimum  ;  a  table  of  mortality  similar  to  that  which  might  be 
supposed  to  prevail  in  an  office  of  great  age  doing  a  uniform 
and steady new business. To produce that result the lives
existing at the close of the observations were traced out
through a hypothetical future in which they were assumed to
be subject to rates of mortality and lapse identical with
the rates actually observed in the past among lives of similar
age and duration. The minor details of the process we may
pass over. The result from a financial point of view is that
the premiums are still underestimated for the younger
insuring ages, although not to the same extent as in a table of
the $H^{M}$ type, and are overestimated at the older ages, the
point at which the values cross the true curve being earlier
than would have been the case had the "final series"
adjustment not been used. There are some practical
difficulties in adopting a method of this kind. One of these
is that after some 15 or 20 years' duration the observed rates
of mortality for individual ages and years of assurance
depend on a very few facts. We then have to apply
the very irregular rates resulting from those few facts
to much larger numbers, including the existing lives that
have been brought back hypothetically under observation;
so that where these irregularities become inconveniently
large, the application of the method must cease; or else
these irregular rates must be subjected to some process of
graduation before being used in the calculations.

* This is shown, in the table above, to be the case both with the $H^{M}$ and $O^{M}$
Tables. Unfortunately, however, in neither case is the comparison very
satisfactory. Dr. Sprague's $H^{[M]}$ premiums from the method of their calculation
are probably somewhat higher than the true values, and in the case of the
$O^{M}$ Table we are comparing select premiums based in part upon the aggregate of
the select tables, excluding first ten years from entry, with $O^{M}$ premiums based
upon an aggregate table from which there had been a further elimination
of duplicate assurances.

This difficulty could be met by using a species of
$O^{M(15)}$ or $O^{M(20)}$ Table for risks of 15 or 20 years' duration and
upwards, instead of the rates of mortality deduced from
individual years of assurance. There are, however, other
objections to this method as an expedient for counteracting
the effect of a too short average duration of assurance.

As  the  rate  of  mortality  amongst  assured  lives  cannot 
strictly  be  treated  as  a  function  of  age  alone,  but  is  also 
dependent  upon  the  duration  of  assurance,  so  the  rates  of 
sickness  in  a  Friendly  Society,  or  of  re-marriage  in  a 
Widow's  Fund,  are  affected,  respectively,  by  the  duration 
of  membership,  or  of  widowhood.  Sufficiently  approximate 
results  may,  however,  be  generally  arrived  at  in  these  cases 
by  treating  the  rate  of  sickness,  or  of  re-marriage,  as  a 
function  of  the  age  alone :  in  the  former  case  because 
the  effect  of  selection  is  not  very  great  and  is  soon  exhausted, 
in  the  latter  case  because  the  average  constitution,  as  regards 
the  duration  of  widowhood,  of  a  group  of  lives  passing  under 
observation at a given age will be found to remain fairly
constant (unless the Pension Fund is of recent establishment)
and the financial effect of a marriage when it occurs is a
function of the age only.

Where, however, we are dealing with rates of discontinuance
or lapse, it is important that these should be
analyzed  both  as  respects  age  and  duration.  Owing  to  the 
fact  that  the  financial  effect  of  a  discontinuance  is  mainly 
dependent  upon  the  duration  of  assurance,  very  erroneous 
conclusions  may  be  deduced  by  treating  the  rates  as  functions 
of  the  age  alone  as  has  sometimes  been  done.  If  this  course 
is  adopted  special  precautions  must  be  taken,  such,  for  example, 
as  deducing  the  rates  from  a  body  of  lives  representing  the 
"  existing  "  some  10  or  20  years  back,  and  excluding  from  the 
"  exposed  to  risk  "  all  more  recent  entrants,  as  proposed  by 
Mr. A. W. Watson (J.I.A., xxxv, 313-4).


SIXTH   LECTURE. 


In the concluding Lecture we shall deal with some
miscellaneous  points  of  general  interest  or  arising  out  of  the 
previous  Lectures.  We  have  already  dealt  with  the  nature 
of  the  modifications  of  Makeham's  formula  for  the  force  of 
mortality,  necessary  to  enable  us  to  represent  satisfactorily 
the mortality shown by select tables such as the $O^{[M]}$.
These modifications consisted in treating the quantities $A$
and $B$ or $\alpha$ and $\beta$ in the formulas

$$\mu_{[x]+t} = A + B c^{x+t}; \qquad \operatorname{colog}_{10} p_{[x]+t} = \alpha + \beta c^{x+t}$$

which are constants as regards the variable $x$, as functions of
$t$, the time elapsed since the date of selection.

It  is  clear  that  a  similar  course  may  be  pursued  if  any 
other  formula  than  Makeham's  is  employed  in  the  graduation 
of  the  "  ultimate  "  table.     Thus  we  may  write 

where $A_t$ and $B_t$ will in general be such functions that, as $t$
reaches a certain value, at which the select and ultimate
mortality rates merge, $A_t$ becomes zero and $B_t$ unity. The
form  of  these  expressions  employed  for  representing  the 
effect  of  selection  suggests  that  a  similar  form  may  be 
employed  for  representing  rate  of  discontinuance,  which  in 
general  may  be  taken  to  be  a  function  of  the  duration  of 
assurance  and  of  the  age  at  entry.  The  same  remark  applies 
to  such  a  function  as  the  rate  of  remarriage  amongst  widows, 
which  is,  similarly,  a  function  of  the  duration  of  widowhood 
and  of  the  age. 



Although we have dealt at considerable length with the
use of Makeham's formula in connection with mortality
tables, there are some further remarks to be made as to its
employment  in  certain  special  cases,  more  particularly  in 
connection  with  the  age  statistics  at  a  Census. 

If we suppose a population which is (1) subject to uniform
rates of mortality, corresponding at the adult ages to
Makeham's formula, (2) such that the numbers living
represent the survivors from a number of births increasing
annually in a geometrical progression, and (3) is subject
to a rate of emigration or immigration uniform at all
ages, then if $l'_x$ represent the numbers in the population,
at a given moment of time, passing through the exact age $x$,
obviously the curve of $l'_x$ will follow Makeham's formula,
and if we write

$$\mu'_x = -\frac{d}{dx}\log_e l'_x$$

we shall have a formula similar to the usual formula for the
force of mortality, but with the constant $A$ increased by $r$,
the rate per annum at which the population is increasing;
that is to say, the "natural" rate of increase less the rate
of emigration. It is true that hardly any population
will be found to conform very closely to the above
assumptions, but nevertheless it will be frequently found
that the population curve for the adult ages does conform
to Makeham's formula for $l_x$, although in most cases it
will be necessary to adopt Makeham's second development
of Gompertz, with the additional constant in the expression
for $\mu_x$.
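
The statement that the constant $A$ is simply increased by $r$ can be checked numerically. The sketch below assumes that, under the three conditions stated, the numbers passing through age $x$ are proportional to $l_x e^{-rx}$; the Makeham constants and the rate $r$ are hypothetical, and the derivative is taken by a central difference.

```python
import math

A, B, c = 0.006, 0.00005, 1.09   # hypothetical Makeham constants
r = 0.012                        # hypothetical annual rate of increase

def log_l(x):
    """log_e l_x (up to a constant) when the force of mortality is A + B c^x."""
    return -A * x - B * c ** x / math.log(c)

def log_l_pop(x):
    """log_e l'_x for the population curve, assumed proportional to l_x e^{-r x}."""
    return log_l(x) - r * x

def force(log_fn, x, h=1e-4):
    """Minus the derivative of log_fn at x, by central difference."""
    return -(log_fn(x + h) - log_fn(x - h)) / (2 * h)

x = 50
mu = force(log_l, x)
mu_pop = force(log_l_pop, x)
print(mu, mu_pop, mu_pop - mu)
```

The difference between the two results is $r$ at every age, so that the population curve is of Makeham's form with $A + r$ in place of $A$.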

If the population is given, as is usual, for decennial age
groups (e.g., 15-25, 25-35, 35-45, &c.), the values of the
ordinate for the middle age of each group may be
obtained with sufficient approximation by deducting from
each term $u_x$ of the series representing the numbers in
successive age groups one twenty-fourth of the central
second difference. From the values of $l'_x$ thus obtained, by writing

$$\log l'_x = K + s x + g c^x,$$

or,

$$\log l'_x = K + s x + h x^2 + g c^x,$$

as the case may be, the constants may be determined as for a
mortality table.

Take,  for  example,  the  male  population  of  England  and 
Wales,  enumerated  at  the  Census  of  1901,  as  under  : — 


Table XII.
Male Population in Age-groups: England and Wales, 1901.

  Age           Numbers*   Central     log (3)   Δ log (3)   Δ² log (3)   Δ³ log (3)   Col. (4)
  Group                    Ordinate                                                    Adjusted
  (1)           (2)        (3)         (4)       (5)         (6)          (7)          (8)

  15-25         94,693
  25-35         76,425     76,373      4.8829    -.1093      -.0322       -.0138       4.88349
  35-45         59,394     59,371      4.7736    -.1415      -.0460       -.0485       4.77301
  45-55         42,924     42,863      4.6321    -.1875      -.0945       -.0987       4.63269
  55-65         27,913     27,838      4.4446    -.2820      -.1932                    4.44401
  65-75         14,691     14,541      4.1626    -.4752                                4.16319
  75-85          5,080      4,868      3.6874                                          3.68681
  85 and over      552

Col. (3) is the number in Col. (2) less one twenty-fourth of its central
second difference. Each difference in Cols. (5) to (7) is entered against
the first of the terms from which it is formed.

*To  reduce  the  magnitude  of  these  numbers,  the  figures  used  are  those 
corresponding  to  a  total  population  (M  &  F)  of  1,000,000  as  given  in  the  Census 
Report.  This,  of  course,  does  not  affect  their  relative  value  nor  the  form  of  the 
curve. 
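
The central ordinates of Table XII can be recomputed from the numbers in the age-groups. The sketch below is illustrative; the function name is hypothetical, and the number for the group 65-75 is taken as 14,691, being the figure implied by the difference between the numbers living above ages 65 and 75 (20,323 and 5,632) in Table XIII.

```python
# Numbers in decennial age-groups 15-25, 25-35, ..., 75-85, 85 and over.
groups = [94693, 76425, 59394, 42924, 27913, 14691, 5080, 552]

def central_ordinates(u):
    """u_x less one twenty-fourth of the central second difference."""
    return [u[i] - (u[i - 1] - 2 * u[i] + u[i + 1]) / 24
            for i in range(1, len(u) - 1)]

ordinates = [round(v) for v in central_ordinates(groups)]
print(ordinates)
```

The rounded results agree with the central ordinates shown for the groups 25-35 to 75-85.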

Here, evidently, Col. (6) cannot be well represented by a
Geometrical Progression, but with Col. (7) this is possible
without very serious changes in the values. This would give
a formula corresponding to Makeham's second modification of
Gompertz, viz.,

$$\log l'_x = K + A x + A' x^2 + B c^x$$

for  the  values  of  the  logs  of  the  numbers  living  at  age  x, 
given  in  Col.  (4).  As  these  numbers  are  only  approximate, 
and  our  object  is  merely  to  show  the  applicability  of  the 
formula  as  a  base  line,  we  may  adopt  a  very  simple  method 
of  determining  the  constants,  similar  to  that  used  by 
Mr. Makeham in his paper on the Law of Mortality (J.I.A.,
xiii, p. 338 et seq.). If the terms in Col. (4) are alternately
diminished and increased by a quantity $z$, the quantities in
Col. (7) will become

$-.0138 + 8z$
$-.0485 - 8z$
$-.0987 + 8z$

These terms can obviously be made to form a geometrical
progression by suitably determining $z$, and their common
ratio, found by dividing the sum of the second and third
terms by the sum of the first and second, will be equal to

$$\frac{.1472}{.0623} = 2.363.$$

Dividing the sum of the first two terms by 3.363 we get
$\frac{-.0623}{3.363} = -.01853$ as the adjusted first term, giving
$8z = -473$ and $z = -59$ in units of the fifth decimal place.
Hence the transformed series for
Col. (4) is as shown in Col. (8), where the progression
accurately follows Makeham's second development.
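
The arithmetic of this adjustment may be followed in a few lines. The sketch is an illustration of the steps just described, using the three third differences of Col. (7); the variable names are hypothetical.

```python
# Third differences from Col. (7) of Table XII.
d3 = [-0.0138, -0.0485, -0.0987]

# Common ratio: (second + third) / (first + second).
ratio = (d3[1] + d3[2]) / (d3[0] + d3[1])

# Adjusted first term: sum of the first two terms divided by (1 + ratio).
first = (d3[0] + d3[1]) / (1 + ratio)

eight_z = first - d3[0]      # the first term becomes d3[0] + 8z
z = eight_z / 8

# The adjusted terms are d3[0] + 8z, d3[1] - 8z, d3[2] + 8z.
adjusted = [d3[0] + eight_z, d3[1] - eight_z, d3[2] + eight_z]
print(round(ratio, 3), round(first, 5), round(z, 5))
print(adjusted[1] / adjusted[0], adjusted[2] / adjusted[1])
```

The two printed ratios are equal, so that the adjusted third differences form a geometrical progression, and $z$ comes out at about $-.00059$.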

It  is  on  the  whole  more  convenient  to  deal  with  the 
numbers  living  above  age  x  rather  than  the  numbers  for 
the  decennial  age  groups. 

If we treat the numbers in Table XII in this manner,
representing the numbers living above age $x$ by the
expression

$$\log Q_x = K + m a^x + n b^x$$

we shall have the results set out in the following table, where
the values of the constants have been determined by ignoring
the extreme values of $\log Q_x$ at ages 15 and 85, and equating
the sums of the values of the above expression to the values
of $(\log Q_{25} + \log Q_{35})$, $(\log Q_{35} + \log Q_{45})$, &c., by which means
we obtain for the values of the constants

  $\log a = .006420$     $m a^{25} = -1.0582$
  $\log b = .035184$     $n b^{25} = -.007933$
  $K = 6.4222$

The five figure logarithms of $Q_x$ were employed in the
calculation, but, owing to the nature of the process, the fifth
figure in the graduated column cannot then be relied upon;
the logs have therefore been throughout cut down to four
figures in the table, which is quite sufficient for the purpose
of illustration.

Table XIII.
Male Population living above the undermentioned ages. England and Wales, 1901.
(Based upon figures in preceding Table.)

  Age   Proportional   log Q_x   Δ log Q_x   log Q'_x =          Δ log Q'_x   log Q'_x - log Q_x
   x    Numbers Q_x                          K + m a^x + n b^x                  +        -

  15    321,672        5.5074    -.1514      5.5059              -.1498        ...      .0015
  25    226,979        5.3560    -.1783      5.3561              -.1785        .0001    ...
  35    150,554        5.1777    -.2179      5.1776              -.2177        ...      .0001
  45     91,160        4.9598    -.2764      4.9599              -.2766        .0001    ...
  55     48,236        4.6834    -.3754      4.6833              -.3753        ...      .0001
  65     20,323        4.3080    -.5573      4.3080              -.5574        ...      ...
  75      5,632        3.7507    -1.0088     3.7506              -.9218        ...      .0001
  85        552        2.7419                2.8288                            .0869    ...

The practical identity of the curves at all ages except 15
and 85, which values were not used in determining the
constants, suggests that very accurate results might be
obtained by making use of a curve of the above form for
interpolation of intermediate values of $Q_x$.
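
The graduated column of Table XIII can be reproduced from the constants, and the same function then serves for interpolation. In the sketch below the constants are read as $\log a = .006420$, $\log b = .035184$, $m a^{25} = -1.0582$, $n b^{25} = -.007933$ and $K = 6.4222$, a reading which is an assumption to the extent that the printed exponents are indistinct; the function name is hypothetical.

```python
K = 6.4222
LOG_A, LOG_B = 0.006420, 0.035184
M_A25, N_B25 = -1.0582, -0.007933   # values of m a^25 and n b^25

def log_Q(x):
    """Graduated log Q'_x = K + m a^x + n b^x, worked from age 25."""
    return (K
            + M_A25 * 10 ** (LOG_A * (x - 25))
            + N_B25 * 10 ** (LOG_B * (x - 25)))

# Graduated values as printed in Table XIII.
table = {15: 5.5059, 25: 5.3561, 35: 5.1776, 45: 4.9599,
         55: 4.6833, 65: 4.3080, 75: 3.7506, 85: 2.8288}

for age, printed in table.items():
    print(age, round(log_Q(age), 4), printed)

# Interpolated number living above age 40, on the same scale as Q_x.
Q_40 = 10 ** log_Q(40)
```

The computed values agree with the printed column to within a unit or so in the fourth decimal place.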

It  has  been  proposed  to  employ  Makeham's  formula  to 
represent  the  curve  of  sickness  rates  at  successive  ages,  and 
this  has  been  done  with  a  certain  degree  of  success,  but  the 
practical  advantages  of  the  formula  as  applied  to  sickness 
rates  are  not  very  apparent,  as  it  is  usually  necessary  to 
know  not  merely  the  total  sickness  rate  at  each  age  but  its 
division  into  sickness  of  various  durations,  as  the  number  of 
weeks  per  annum  during  the  first  six  months  of  illness,  from 
the sixth to the twelfth month, after the twelfth month,
&c. As Makeham shows (J.I.A., xvi, 414), the ratio

$$\frac{\text{Weeks' sickness experienced in the year of age}}{\text{Exposed to risk in middle of year of age}}$$

is not a function similar to $\mu_x$ but to $q_x$, since it has a definite
limit, namely, 52, or 1 if the sickness is expressed in years in
lieu of weeks. Hence if we represent the above ratio by the
symbol $s_x$, we should write

$$\log(52 - s_x) = A + B c^x.$$

Where, by the constitution of a society, there is no formal
superannuation, the sickness benefit continuing throughout
life, it is almost invariably the practice of actuaries in using
Sickness Tables for the purpose of computing contributions
or valuing benefits to assume that the so-called "sickness"
will become chronic after a certain age, 70, 75, or 80. In
such cases, as the rates of sickness actually employed will
generally be much below the maximum of 52 weeks, we may
use $\log(N - s_x) = A + B c^x$. The value of $N$ must be determined
by trial.
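
The behaviour of this formula may be seen from a short sketch. The constants below are purely hypothetical and are not drawn from any sickness experience; they are chosen with $B$ negative and $c$ greater than unity, so that the rate rises with age towards the limit $N$ without reaching it.

```python
import math

def sickness_weeks(x, A, B, c, N=52.0):
    """Weeks of sickness s_x from log10(N - s_x) = A + B c^x."""
    return N - 10 ** (A + B * c ** x)

# Hypothetical constants for illustration only.
N = 52.0
A = math.log10(N)   # makes s_x tend to zero at very young ages
B = -0.0005
c = 1.09

rates = [sickness_weeks(x, A, B, c, N) for x in range(20, 91, 10)]
print([round(s, 2) for s in rates])
```

In practice $N$ would be varied by trial, as the text directs, until the observed rates are satisfactorily represented.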

Mr.  King  has  given  an  example  in  the  graduation  of  the 
values  in  the  Text-book  mortality  table  at  the  youngest  ages 
of a further application of Makeham's formula, the term $Bc^x$
in  the  expression  for  the  force  of  Mortality  representing,  of 
course,  equally  well  an  increasing  rate  of  mortality  as  in  adult 
life  or  a  diminishing  rate  as  in  infancy  and  childhood. 

In  the  common  case  of  an  asymmetrical  series  the  terms 
of  which  become  zero,  or  very  nearly  so,  at  each  end,  the 
following method of employing the "normal" frequency
curve to represent the series will often be found convenient
and effective, particularly if the data are presented in the
form of a few groups. Let the successive ordinates of the
curve be represented by the equation $y = f(x)$; we shall
assume the total area of the curve to be unity and the area of
curve between the limits $x = \infty$ and $x = t$ will be $\int_t^\infty y\,dx$. Let
us write

$$Y_t = \int_t^\infty y\,dx = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$$

so that

$$Y_0 = 1 = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-t^2}\,dt$$

where $z$ is a function of $t$, the form of which is to be
determined by the data. For most purposes it will be
sufficient to treat $z$ as a parabolic function of $t$, but it will be
seen later that there are certain cases in which a different
hypothesis as to the form of the function $z$ is to be preferred.
An example will make plain the method of proceeding.
Take the $O^{M}$ data as summarized on p. viii of the volume
of Unadjusted Data (Whole-life, Males). In the last two
columns of the table there is given the "proportionate
distribution per-cent" of the exposed to risk and died.
Taking the figures there given we obtain the following tables.
The values of $z$ are found by entering a table of

$$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$$

for + and - arguments with the values in the second
columns of Tables (XIV) and (XV). We may employ a table
such as that given by Woolhouse (J.I.A., vol. xvii, p. 50) or
that given on pages 138, 139, at the end of these lectures. Note,
however, that in each of these tables the function tabulated
is $\frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt$, say $I_z$, for + arguments only, so that the total
area of the curve from $-\infty$ to $+\infty$ is 2 instead of 1. Hence,
if $Y_t$ is $> \frac{1}{2}$ we must put

$$Y_t = \frac{1}{2} + \frac{1}{2}\cdot\frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt$$

so that $z$ takes the value corresponding to the tabular value
$I_z = 2Y_t - 1$. Similarly, if $Y_t$ is $< \frac{1}{2}$ we put $z$ negative and
numerically equal to the argument giving $I_z = 1 - 2Y_t$.
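
In place of entering a printed table, the value of $z$ answering to a given proportion $Y_t$ may be found by machine. The sketch below is an illustration: it uses the standard error function `math.erf`, which is the tabulated $I_z$, and inverts $Y = \frac{1}{2}(1 + I_z)$ by bisection; the function name and the search limits are assumptions.

```python
import math

def z_from_Y(Y, lo=-6.0, hi=6.0):
    """Solve Y = (1 + erf(z)) / 2 for z by bisection (requires 0 < Y < 1)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if 0.5 * (1.0 + math.erf(mid)) < Y:
            lo = mid    # z must be larger
        else:
            hi = mid    # z must be smaller
    return (lo + hi) / 2

# Proportions exposed to risk above ages 20, 40 and 50 (Table XIV).
for Y in (0.99584, 0.65989, 0.39810):
    print(Y, round(z_from_Y(Y), 4))
```

The results agree closely with the values 1.8660, .2915 and -.1826 entered against those proportions in Table XIV.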


Table XIV.
$O^{M}$ Data. Exposed to Risk.

  Age    Proportion Exposed    Values      Δz        Δ²z       Δ³z       Δ⁴z       Δ⁵z
   t     to Risk above age t   of z

   0     1.00000                ∞ *
  10      .99991                2.6500     -.7840
  20      .99584                1.8660     -.9574     .3403    -.1973     .0849    -.0358
  30      .90060                 .9086     -.6171     .1430    -.1124     .0491    -.0396
  40      .65989                 .2915     -.4741     .0306    -.0633     .0095    -.0165
  50      .39810                -.1826     -.4435    -.0327    -.0538    -.0070
  60      .18795                -.6261     -.4762    -.0865    -.0608
  70      .05951               -1.1023     -.5627    -.1473
  80      .00927               -1.6650     -.7100
  90      .00039               -2.3750

The second column is the value of $\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$. Each difference is
entered against the first of the terms from which it is formed.

*  Theoretically  the  values  of  z  corresponding  to  a  total  frequency  of  1  and  0 
are  respectively  ±oo.  As  however  s=±3  corresponds  to  Y  =  ^999989  or 
•000011,  «=  ±3-5  to  Y  =  ^99999963  or  ^00000037,  and  s=  ±4  to  Y  = '999999992 
or  -000000008,  it  will  he  seen  that  any  value  of  z  over  3  will  sufficiently  represent 
the  complete  distribution  or  the  zero  value,  and  in  practice  it  would  be  quite 
sufficient  to  insert  at  the  ends  of  the  table  any  convenient  value  of  s  over  3, 
and  consistent  with  the  general  run  of  the  intervening  terms. 
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Table  XV. 


$O^{M}$ Data.     Deaths.

Column (2) gives the proportion of Deaths above age t, that is, the value of
$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^{2}}\,dt$; column (3) the corresponding value of z; column (4) the
mean error of z in the last place of decimals.

 Age t   Proportion       z       Mean       Δz        Δ²z       Δ³z       Δ⁴z       Δ⁵z
                                 Error
   0      1.00000         ∞*
  10      1.00000         ∞*
  20       .99925      2.2450    ±192     -.8511     .3190    -.1900     .0717    -.0312
  30       .97565      1.3939    ± 48     -.5321     .1290    -.1183     .0405    -.0285
  40       .88854       .8618    ± 28     -.4031     .0107    -.0778     .0120    -.0159
  50       .74174       .4587    ± 24     -.3924    -.0671    -.0658    -.0039
  60       .53731       .0663    ± 23     -.4595    -.1329    -.0697
  70       .28908      -.3932    ± 24     -.5924    -.2026
  80       .08169      -.9856    ± 32     -.7950
  90       .00590     -1.7806    ± 82

* See note at foot of Table XIV on preceding page. It is to be noted, that in
lieu of the integral of the normal frequency function, the function $e^{z}/(1+e^{z})$
may be used, leading to a method of procedure similar to that referred to
on p. 51.


The column containing the mean error or standard
deviation of z in the table of deaths is computed as follows.
If the total of the series (in this case the total deaths) is n,
and the total above a given point (in this case the number of
deaths above age t) is m, then the mean error in m is equal to

$$\sqrt{\frac{m(n-m)}{n}}.$$

From this can be calculated the mean
errors of the values in column (2). The change in the value
of z corresponding to a given change in the values of
$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^{2}}\,dt$ in column (2) being known from the table of this
function, we obtain the values in column (4). These standard
deviations are not inserted in the table of Exposed to Risk, as
the principle upon which the mean errors in the proportionate
distribution of the deaths are computed is not strictly
applicable to the table of Exposed to Risk, when the latter
represent observations spread over a long and continuous
period, although it would be applicable if the numbers dealt
with represented the exposures in a single calendar year.
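The computation of column (4) may be sketched as follows. The total number of deaths is not stated in this passage, so the value of `n` used below is purely an assumed figure for illustration, and the function name is hypothetical; the derivative of the tabulated function with respect to z, namely $e^{-z^{2}}/\sqrt{\pi}$, takes the place of reading the change from the printed table.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def mean_error_of_z(y: float, n: int) -> tuple[float, float]:
    """Return (z, mean error of z) for a proportion y observed on n deaths."""
    m = y * n                              # number above the given age
    error_m = sqrt(m * (n - m) / n)        # mean error in m
    error_y = error_m / n                  # mean error in the proportion
    z = NormalDist().inv_cdf(y) / sqrt(2.0)
    slope = exp(-z * z) / sqrt(pi)         # change in y per unit change in z
    return z, error_y / slope


ASSUMED_TOTAL_DEATHS = 150_000             # illustrative only, not from the text
z50, err50 = mean_error_of_z(0.74174, ASSUMED_TOTAL_DEATHS)
```

With a total of this order the result for age 50 comes out at a few units in the third decimal place of z, which is the order of magnitude shown in Table XV; the error grows rapidly towards either end of the table, where the slope becomes small.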

If we examine the columns of the successive differences
of z in the two tables, ignoring the infinite values of z
corresponding to a total distribution of unity, we shall see
that they exhibit a remarkable similarity in the nature of
their  progression,  especially  from  the  columns  Ah  onwards. 
It will also be apparent that a very small alteration of the
original values of z in either table would be needed to make
the fifth differences constant; that is, we may assume without
serious error that

$$z = a + bt + c\,\frac{t(t-1)}{2} + \text{\&c.}$$

In order to obtain the closest agreement with the
original facts due regard would have to be taken of the
weights corresponding to the mean errors in the value of
z as given in the table. But we shall obtain results quite
good enough for all purposes by the following simple
procedure. It will be observed from the values of the
mean errors that the values of z for ages 40 to 70 have
approximately the same weight, those for ages 30 and 80
have somewhat less weight and finally those for ages 20
and 90 much less.

If we combine the values of z in sets, thus,

$$z_{20}+3z_{30}+z_{40};\qquad z_{30}+3z_{40}+z_{50};\qquad \text{\&c.},$$

with their corresponding numerical values we shall obtain
six equations to determine the six coefficients, a, b, c, ... f.
Into these equations the values $z_{20}$ and $z_{90}$ will enter once,
the values $z_{30}$ and $z_{80}$ four times, and the remaining values
five times. We need not compute the numerical values
for all these equations as it will be evident that if we
write them down and difference them we shall arrive at
the following:

$$\begin{aligned}
5a+5b+c &= z_{20}+3z_{30}+z_{40} &&= 7.2885\\
5b+5c+d &= \Delta(z_{20}+3z_{30}+z_{40}) &&= -2.8505\\
5c+5d+e &= \Delta^{2}(z_{20}+3z_{30}+z_{40}) &&= .7167\\
5d+5e+f &= \Delta^{3}(z_{20}+3z_{30}+z_{40}) &&= -.6227\\
5e+5f &= \Delta^{4}(z_{20}+3z_{30}+z_{40}) &&= .2052\\
5f &= \Delta^{5}(z_{20}+3z_{30}+z_{40}) &&= -.1326
\end{aligned}$$

From these equations the values of f, e, d, &c., can be
obtained with great facility.
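Since each equation introduces only one new unknown when taken from the last upwards, the solution is a simple back-substitution. The following is a minimal sketch of that step; the function name is hypothetical and the right-hand sides are those of the six equations above.

```python
def solve_by_back_substitution(rhs: list[float]) -> list[float]:
    """Solve 5*k[i] + 5*k[i+1] + k[i+2] = rhs[i], taking missing terms as zero.

    The unknowns k[0], k[1], ... stand for a, b, c, d, e, f.
    """
    size = len(rhs)
    k = [0.0] * size
    for i in range(size - 1, -1, -1):          # start from the last equation
        nxt = k[i + 1] if i + 1 < size else 0.0
        nxt2 = k[i + 2] if i + 2 < size else 0.0
        k[i] = (rhs[i] - 5.0 * nxt - nxt2) / 5.0
    return k


rhs = [7.2885, -2.8505, 0.7167, -0.6227, 0.2052, -0.1326]
a, b, c, d, e, f = solve_by_back_substitution(rhs)
```

The order of working is f, then e, then d, and so on up to a, exactly as the text indicates.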
Having obtained a formula for z in terms of t, we can now
obtain any term in the series and can also obtain the
value of y, the ordinate representing the number of deaths at
age x (i.e., approximately between ages $x-\tfrac12$ and $x+\tfrac12$) since

$$y = -\frac{1}{\sqrt{\pi}}\,e^{-z^{2}}\,\frac{dz}{dt}$$

and

$$\frac{dz}{dt} = \Delta z - \tfrac12\Delta^{2}z + \tfrac13\Delta^{3}z - \tfrac14\Delta^{4}z + \tfrac15\Delta^{5}z.$$

It will generally be sufficient to compute the values of y
for decennial or at most quinquennial intervals and to
interpolate the resulting values of $q_x$ or $m_x$ for the intermediate
ages.

The values of the quantities a, b, c, &c., satisfying the
above equations, are

a =  2.24374        d = -.186796
b = -.849365        e =  .067560
c =  .316624        f = -.026520

It may be of interest to give the adjusted values of z and
the distribution of deaths corresponding to these which are as
under:

Table XVI.

$O^{M}$ Data.     Deaths.
Adjusted values of z and adjusted distribution of Deaths.

The third column is the value of $\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^{2}}\,dt$; the last two show by how
much it is more (+) or less (-) than the corresponding column in Table (XV).

 Age         z         Distribution        +          -
   0      6.13645        1.00000          ...        ...
  10      3.69061        1.00000          ...        ...
  20      2.24374         .99925
  30      1.39438         .97569         .00004      ...
  40       .86163         .88848          ...       .00006
  50       .45872         .74174         .00000
  60       .06640         .53740         .00009      ...
  70      -.39352         .28893          ...       .00015
  80      -.98173         .08188         .00019      ...
  90     -1.78289         .00585          ...       .00005
 100     -2.90220         .00002          ...        ...
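The adjusted values may be checked by evaluating the fifth-difference formula directly. The sketch below assumes, as the equations for the coefficients imply, that t is reckoned in units of ten years from age 20, and that the terms of the formula are the usual advancing-difference terms $t(t-1)\dots(t-k+1)/k!$; the function names are hypothetical.

```python
from math import erf

# a, b, c, d, e, f as given in the text.
COEFFICIENTS = [2.24374, -0.849365, 0.316624, -0.186796, 0.067560, -0.026520]


def factorial_term(t: float, k: int) -> float:
    """t(t-1)...(t-k+1)/k!, valid for positive or negative t."""
    value = 1.0
    for i in range(k):
        value *= (t - i) / (i + 1)
    return value


def adjusted_z(age: float) -> float:
    t = (age - 20.0) / 10.0                    # origin at age 20, unit 10 years
    return sum(coef * factorial_term(t, k)
               for k, coef in enumerate(COEFFICIENTS))


def proportion_above(age: float) -> float:
    """Adjusted proportion of deaths above the given age."""
    return 0.5 * (1.0 + erf(adjusted_z(age)))


for age in range(0, 101, 10):
    print(age, round(adjusted_z(age), 5), round(proportion_above(age), 5))
```

The same function gives z for any intermediate age, so that the distribution need not be confined to decennial points.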
The principal objection to this adjustment, paradoxical as it
may sound, is that it too closely follows the original facts, the
deviations being very much smaller than the probable errors
of the observations. This is, of course, due to the fact that
we have included too many constants in our formula. A
constant fourth difference in the values of z, however, may
lead to anomalous results, and a constant third difference
makes the errors of adjustment too great. The best plan in
such a case would be to adjust the exposures by using a
constant third difference, to recompute the deaths to
correspond to the adjusted exposures in the 10 year groups
and then employ a constant third difference for the graduation
of the death curve. Or, as an alternative, an expression for
z may be assumed of the form

$$z = k + \frac{m}{a+x} + \frac{n}{b+x}$$

and the values of k, m, n, a, b, determined by weighting the
equations in a manner similar to that shown above for the
fifth difference curve.

We have used the $O^{M}$ data to illustrate the above process,
but generally speaking the latter will be found more useful
where the data are only available in large groups, and, in
particular, where the limits of the series are not well defined.

In the following table we have a statement taken from
Supplement to the Registrar-General's 45th Annual Report,
p. cxviii, showing the number of Innkeepers, &c., living at
or over certain given ages.

Table XVII.
Innkeepers, Publicans, &c. (1881).

The proportional numbers are the values of $\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^{2}}\,dt$.

 Ages t    Living above age t    Proportional numbers    Values of z
   15          232,890                 1.0000                  ∞
   20          230,280                  .9888               1.6147
   25          222,213                  .9542               1.1929
   45          105,153                  .4515               -.0862
   65           14,451                  .0620              -1.0877

It will be seen that more than 50 per-cent of the numbers
living are in the age-group 25-45, and nearly 40 per-cent in
the group 45-65. In such a series the usual methods of
interpolation would probably give unsatisfactory results.

If we treat the values of z as having constant third
differences, we obtain the following equations, taking five
years of age as the unit:

$$\begin{aligned}
a &= 1.6147\\
a+b &= 1.1929\\
a+5b+10c+10d &= -.0862\\
a+9b+36c+84d &= -1.0877
\end{aligned}$$

b, c, and d are the values, reckoning from age 20, of the
differences of z. The values of a and b are given immediately
and solving the remaining equations for c and d we obtain:

a =  1.6147        c =  .04863
b = -.4218         d = -.00782
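The solution of these four equations, and the interpolation that follows from it, can be sketched as below. The variable and function names are hypothetical; the multipliers 10, 10, 36 and 84 are the binomial coefficients for 5 and 9 units of five years from age 20.

```python
from math import comb

# Values of z at ages 20, 25, 45 and 65 from Table XVII.
Z20, Z25, Z45, Z65 = 1.6147, 1.1929, -0.0862, -1.0877

a = Z20
b = Z25 - Z20

# Remaining pair of equations in c and d, solved by elimination.
c5, d5, r5 = comb(5, 2), comb(5, 3), Z45 - a - 5 * b
c9, d9, r9 = comb(9, 2), comb(9, 3), Z65 - a - 9 * b
determinant = c5 * d9 - d5 * c9
c = (r5 * d9 - d5 * r9) / determinant
d = (c5 * r9 - c9 * r5) / determinant


def z_at(age: float) -> float:
    """Third-difference curve, t in units of five years from age 20."""
    t = (age - 20.0) / 5.0
    return a + b * t + c * t * (t - 1) / 2 + d * t * (t - 1) * (t - 2) / 6


print(round(c, 5), round(d, 5), round(z_at(30), 4))
```

Evaluating `z_at` at each quinquennial age gives the interpolated values of z from which the proportions living above each age are read off.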


which enable us to form at once the following series of
quinquennial age groups.

Table XVIII.
Innkeepers, Publicans, &c. (1881 Census).

 Age t    Interpolated     Corresponding values of       Proportional Population
          Values of z      $\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^{2}}\,dt$ *      between Age t and (t+5)
   15       2.0930              .9985                         97
   20       1.6147              .9888                        346
   25       1.1929              .9542                        774
   30        .8197              .8768                       1221
   35        .4874              .7547                       1499
   40        .1880              .6048                       1533
   45       -.0862              .4515                       1376
   50       -.3429              .3139                       1119
   55       -.5904              .2020                        834
   60       -.8360              .1186                        566
   65      -1.0876              .0620                        342
   70      -1.3534              .0278                        176.7
   75      -1.6407              .01017                        73.5
   80      -1.9577              .00281                        22.5
   85      -2.3121              .00053                         4.7
   90      -2.7117              .00006                          .6

* Representing the population living above age x out of a total population of 1.
It will be seen that this distribution shows a small number
of cases below age 15. This may be avoided if it is desired
to commence the curve at and not before that age, by
writing

$$z = \frac{m}{t-15} + a + bt + ct^{2}$$

giving

$$\frac{dz}{dt} = b + 2ct - \frac{m}{(t-15)^{2}},$$

the term $\dfrac{m}{t-15}$ being introduced in order to give the high
values of z required near the origin, or, we may write as
suggested above, in connection with the $O^{M}$ data,

$$z = k + \frac{m}{a+x} + \frac{n}{b+x},$$

the value of a being taken in this case as equal to -15.

This form for the value of z will be found very convenient
where the series is known to be limited in either direction and
the number of groups is small. In certain cases either
a or b may be known, and we have, then, only four constants,
m, n, k, and b or a, to determine, for which four groups will
suffice. Or it may be convenient to assume values for both
a and b, in which case with four groups we may write

t  +  a      1  +  0 
determining  m,  n,  k  and  c  from  the  data. 

In  the  case  of  any  statistics  intended  to  be  used  by  the 
actuary, it is important to consider not only how far they are
suitable  for  the  purpose  for  which  they  are  to  be  employed, 
but  also  whether  the  data  are  sufficient  to  render  the 
conclusions  drawn  from  them  safe.  We  have  already  referred 
to  this  question  in  general  terms,  but  it  is  necessary  to 
consider  it  rather  more  closely. 

In practice the actuary has to deal either, (1), with tables
based upon a large number of observations; for example,
tables such as the $O^{M}$, the Government Annuitants, the
Manchester Unity Tables of Sickness, &c., where the
accidental errors due to the limited numbers are practically
insignificant, but where, on the other hand, there may be
uncertainty as to the suitability of the experience for the
case in hand; or, (2), with data of more limited extent but
known to be applicable, as in the valuation of a pension
fund of a Friendly Society by tables based upon its own
experience.

In  the  latter  case  it  is  important  to  be  able  to  form  some 
judgment  as  to  the  extent  of  the  probable  errors  involved  in 
the  use  of  the  data  and  their  effect  upon  the  financial  values 
deduced  therefrom.  This  is  a  problem  not  susceptible  of  an 
exact  solution.  It  is  true  that  if  the  series  of  numbers 
representing  the  deaths,  marriages,  or  retirements,  as  the 
case  may  be,  can  be  represented  by  a  frequency  curve,  the 
probable  error  of  the  constants  may  be  obtained  in  the  manner 
shown  by  Professor  Karl  Pearson  in  his  paper  on  this  subject. 
But these results will be of little practical use to us, as
the manner in which these probable errors, which are not
independent, will affect the monetary values deduced from
the graduated rates is too complicated. We can only deal
with the problem in a very general manner. We are not
even sure that the ordinary theory of errors is applicable to
such functions as rates of mortality, sickness, or
superannuation; indeed, we may well suspect that it is not strictly
applicable.

If  the  probability  of  throwing  head  at  a  single  toss  of  a  coin 
is  one-half,  and  if  in  100  throws  54  heads  appear  to  46  tails,  we 
do  not  suppose  that  the  probability  of  the  average  number  of 
50  heads  appearing  in  the  next  100  throws  is  affected.  But 
in  the  case  of  the  probabilities  of  death  it  may  well  be  that 
an  abnormally  high  or  low  rate  of  mortality  in  a  given  year 
may  affect  the  probable  rate  in  succeeding  years,  and  that 
there  may  be  a  tendency  for  the  deviations  from  the  average 
result  to  correct  themselves,  a  low  rate  in  a  given  year 
leaving  a  larger  number,  and  a  high  rate  a  smaller  number, 
of  impaired  lives  surviving,  and  thus  changing  for  the  time 
being  the  constitution  of  the  group  under  observation. 

The "standard deviation" in the value of $a_x$ as deduced
from a given experience has not, that I am aware of, been
estimated. It will be instructive to attempt this, as an
example, for the $O^{M(5)}$ table. It will be sufficient to use
approximate methods, as the results will be quite accurate
enough for our purpose. We shall assume that we may take

$$\operatorname{colog}_e p = m = \frac{q}{1-\tfrac12 q}$$

and that if the standard deviation in $\log y = \sigma$, then the
standard deviation in $y = \sigma y$.*

Taking the observations at a given age x, let us put

exposed to risk                         = n
graduated or "true" rate of mortality   = q
graduated deaths                        = nq      = d
actual deaths                           = nq + z  = d'
observed value of q                     = q'      = q + z/n

where, as we have seen, the average value of z is zero, the
average value of $z^{2} = nq(1-q)$, &c. (see p. 110).
Then the observed value of m = m' where

$$\begin{aligned}
m' &= \frac{nq+z}{n-\tfrac12(nq+z)} = \frac{q}{1-\tfrac12 q} + (\text{terms in powers of } z)\\
&= m + f(z), \text{ say}\\
&= \operatorname{colog}_e p + f(z)
\end{aligned}$$

It  will  be  found  that  the  average  value  of /(z)  is  not  quite 
^ero  though  very  nearly  so,  being  equal  to  w^^^  +  ^A 
nearly,    a    quantity    that   may   be   neglected;    and  that  the 


average value of $[f(z)]^{2}$ is $\dfrac{m^{2}}{nq}$ very nearly, and

$$\sqrt{\text{average value of } [f(z)]^{2}} = \frac{m}{\sqrt{nq}}$$
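The closeness of this approximation can be examined numerically. The sketch below is an illustration only: it assumes that the actual deaths follow the binomial law with the values of n and q chosen arbitrarily, and it computes the exact mean and standard deviation of the observed central rate $m' = d'/(n - \tfrac12 d')$ for comparison with $m/\sqrt{nq}$. The function name is hypothetical.

```python
from math import sqrt


def central_rate_moments(n: int, q: float) -> tuple[float, float]:
    """Exact mean and standard deviation of m' = d'/(n - d'/2), d' binomial."""
    prob = (1.0 - q) ** n                  # probability of no deaths
    step = q / (1.0 - q)
    mean = 0.0
    mean_square = 0.0
    for deaths in range(n + 1):
        m_observed = deaths / (n - 0.5 * deaths)
        mean += prob * m_observed
        mean_square += prob * m_observed * m_observed
        prob *= step * (n - deaths) / (deaths + 1)   # next binomial term
    return mean, sqrt(mean_square - mean * mean)


n, q = 2000, 0.02                          # assumed illustrative values
m = q / (1.0 - 0.5 * q)
mean_m, sd_m = central_rate_moments(n, q)
approximate_sd = m / sqrt(n * q)
print(mean_m - m, sd_m, approximate_sd)
```

With these figures the exact standard deviation and the rule "rate divided by the square root of the deaths" agree to a small fraction of one per-cent, and the average of m' exceeds m by a quantity that is negligible in comparison.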

Hence, the standard deviation in the "central" death (or
marriage or secession or any similar) rate is very nearly equal
to the rate divided by the square root of the number of deaths
(marriages or secessions, &c.). The errors in $\log_e p$ are of
course the same, but of opposite sign to those in $\operatorname{colog}_e p$.

* If $\log_e y$ have the small error $\sigma$, y will be changed to $e^{\log_e y+\sigma} = y\,e^{\sigma}$
$= y(1+\sigma+\dots)$, i.e., the corresponding error in y will be $\sigma y$ nearly.

Let the observed value of $\log_e p_x$ be $\log_e p'_x$. We will write

$$\log_e p'_x = \log_e p_x + u_x$$

where $u_x$ is the error of observation whose value in a particular
case is fixed but unknown, the average value over a long
series of similar observations being zero, and the average
value of $u_x^{2}$ being $\dfrac{(\operatorname{colog}_e p_x)^{2}}{nq_x}$ or $\dfrac{(m_x)^{2}}{nq_x}$, where $nq_x$ is the
graduated number of deaths at age x.

Taking an arbitrary radix for our mortality table, say $l_x$,
the values of $\log l_{x+t}$ for ages above x will be

$$\begin{aligned}
\log_e l'_x &= \log_e l_x\\
\log_e l'_{x+1} &= \log_e l_{x+1} + u_x\\
&\;\;\vdots\\
\log_e l'_{x+t} &= \log_e l_{x+t} + (u_x + u_{x+1} + \dots + u_{x+t-1})
\end{aligned}$$

similarly, we shall have

$$\log_e D'_x = \log_e D_x$$

and for higher ages

$$\log_e D'_{x+t} = \log_e D_{x+t} + (u_x + u_{x+1} + \dots + u_{x+t-1})$$

whence, on the principle of approximation laid down above,

$$\frac{D'_{x+t}}{D_x} = \frac{D_{x+t} + D_{x+t}(u_x + u_{x+1} + \dots + u_{x+t-1})}{D_x}.$$

Summing this for all values of t from 1 to infinity, we shall
have

$$a'_x = \frac{N'_x}{D_x} = \frac{N_x + (u_x N_x + u_{x+1}N_{x+1} + u_{x+2}N_{x+2} + \dots)}{D_x}.$$

Here the quantity in the bracket in the numerator is the
error in the value of $N'_x$ as deduced from the observations in
relation to the value of $D'_x$ corresponding to the arbitrary
radix assumed at that age. The average value of each term
in the bracket is zero, and the square root of the sum of the
average values of the squares of these terms divided by $D_x$
will give the standard deviation in the value of $a_x$ as deduced
from the data, which, omitting the suffix x, becomes

$$\frac{1}{D}\sqrt{u^{2}N^{2} + u_1^{2}N_1^{2} + u_2^{2}N_2^{2} + \text{\&c.}}$$
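The last expression is easily computed when the commutation values and the mean squares of the errors are to hand. The sketch below uses a short set of made-up figures purely to show the arithmetic; neither the D values nor the error squares are taken from any real table, the function name is hypothetical, and N is taken, as in the summation above from t = 1, to exclude the D of its own age.

```python
from math import sqrt


def annuity_standard_deviation(d_values: list[float],
                               error_squares: list[float]) -> float:
    """(1/D) * sqrt(u^2 N^2 + u_1^2 N_1^2 + ...), ages running from x upward.

    d_values[t] is D at age x+t; error_squares[t] is the average value of
    the square of u at age x+t.
    """
    total = 0.0
    n_value = 0.0
    # Work downward from the oldest age so that N accumulates as we go.
    for t in range(len(d_values) - 1, -1, -1):
        total += error_squares[t] * n_value * n_value
        n_value += d_values[t]             # becomes N for the next younger age
    return sqrt(total) / d_values[0]


# Hypothetical figures for illustration only.
D = [100.0, 80.0, 50.0, 20.0]
U2 = [0.0001, 0.0004, 0.0009, 0.0]
sd = annuity_standard_deviation(D, U2)
print(round(sd, 6))
```

Here N at the four ages is 150, 70, 20 and 0, so that the sum under the root is 2.25 + 1.96 + .36 = 4.57.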

If the mortality table be graduated the standard deviation of
the graduated values of $a_x$ will be somewhat less than that
of the ungraduated values, but not materially less, except at
the ends of the table, the principal effect of the graduation
being merely to produce a smooth progression in values.

We  might  assume,  for  example,  that  the  effect  of  graduation 
was  about  equivalent  to  substituting  the  average  error  of  five 
successive values of $N'_x$ for the error of the middle value.
This  would  give  (omitting  a  quite  insignificant  term)  the 
expression 

for the error in the graduated value of $a_x$ in lieu of the
expression  given  above. 

If we shorten the expression for the standard deviation
of $a_x$ from

$$\frac{1}{D}\sqrt{u^{2}N^{2} + u_1^{2}N_1^{2} + u_2^{2}N_2^{2} + \text{\&c.}}$$

to its approximate equivalent

$$\frac{1}{D}\sqrt{5u_2^{2}N_2^{2} + 5u_7^{2}N_7^{2} + 5u_{12}^{2}N_{12}^{2} + \text{\&c.}},$$

and, further, take

$$5u_{x+2}^{2} = \frac{25[\operatorname{colog}_e(p_{x+2})]^{2}}{\text{Observed deaths between } x \text{ and } (x+5)}$$

we shall considerably shorten the labour of calculation, and
at the same time, by slightly underestimating the required
value, make a rough allowance for the effect of graduation.

We are now in a position to compute a table of standard
deviations for $a_x$ for quinquennial intervals of age, the
principal steps of the working being set out in the table
following, the final columns showing the mean errors or
standard deviations in the value of $a_x$ and the corresponding
mean errors in $P_x$ found by dividing the former results by the
quantity $(1+a_x)^{2}$.*

* If $a_x$ have an error $\sigma_x$, then $P_x$ will have the error

$$\left(\frac{1}{1+a_x}-d\right)-\left(\frac{1}{1+a_x+\sigma_x}-d\right)=\frac{1}{1+a_x}-\frac{1}{1+a_x+\sigma_x}=\sigma_x\div(1+a_x)^{2}\ \text{nearly}.$$
Table XIX.

Computation of the standard deviations $\sigma_x$ in the deduced values
of $a_x$ and $\sigma_x \div (1+a_x)^{2}$ in the deduced values of $P_x$.

Columns:
(1) Age x.
(2) $25[\operatorname{colog}_e p_{x+2}]^{2} \times 100$.
(3) Deaths between ages x and x+5.
(4) $10^{2} \times (2) \div (3)$.
(5) (4) multiplied by $N_{x+2}^{2}$, reduced by a power of 10.
(6) Sum of last column.
(7) $\sigma_x$, from the square root of (6) divided by $D_x$.
(8) $\sigma_x \div (1+a_x)^{2}$.

 (1)      (2)        (3)        (4)          (5)          (6)        (7)       (8)
  15      .1020         10     1.020       20244        21570       .2200     .00038
  20      .1113        122      .0912       1163         1326       .0653     .00012
  25      .1266        924      .01370       110.0        162.5     .0274     .00006
  30      .1525      3,072      .004966       24.48        52.52    .0187     .00004
  35      .1981      5,689      .003482       10.20        28.04    .0165     .00004
  40      .2813      8,152      .003451        5.758       17.84    .0159     .00005
  45      .4410     10,257      .004295        3.864       12.08    .0160     .00006
  50      .7632     12,620      .006048        2.726        8.215   .0164     .00007
  55     1.444      14,903      .009694        1.986        5.489   .0169     .00010
  60     2.945      16,618      .01772         1.445        3.503   .0177     .00014
  65     6.359      17,455      .03644          .9770       2.059   .0187     .00021
  70    14.32       16,042      .08929          .6052       1.082   .0203     .00033
  75    33.20       12,172      .2728           .3185        .4764  .0228     .00059
  80    78.51        7,317     1.073            .1227        .1580  .0272     .00116
  85   188.1         2,865     6.566            .03151       .03528 .0364     .00267
  90   454.6           692    65.71             .003659      .003775 .0550    .00705
  95  1105              86  1285                .000118      .000118 .0966    .02146

The result we have arrived at shows that the mean error,
or standard deviation, in the values of the 3 per-cent
Annuities in an aggregate experience such as the $O^{M(5)}$ is
about one-fiftieth of a year's purchase from about 30 to 65
years of age. Owing to the greater number of deaths at the
younger ages in the $O^{M}$ experience this would about represent
standard deviations for that Table from 25 to 65.

If we suppose an experience in which the data were
one-hundredth of the extent of the $O^{M(5)}$ but similarly
distributed, it is obvious, from a consideration of the process
by which the above result was obtained, that the standard
deviations or mean errors in the annuity-values would be
ten times greater than the values found above. Hence, with
an experience including about 1,000 deaths distributed
approximately as in the $O^{M(5)}$ data the deduced annuity-values
between ages 30 and 60 would on the average be uncertain
to about ±.20, or from 1 per-cent to 1½ per-cent of their
values. The standard deviations above obtained would be
somewhat reduced in a small experience by graduating the
experience by Makeham or by a suitable frequency curve,
but not very materially. It would occupy too much time
to investigate this point, but we may easily find a limit to the
effect of any possible method of graduation in reducing
the standard deviations of the annuities. In any ordinary
experience such as the $O^{M}$, where the observed deaths are a
small fraction of the lives passing under observation, the
errors in the annuity-values will be due, 1°, to the mortality
on the whole being above or below normal, 2°, to the
distribution of the mortality being abnormal. This latter
factor can alone be affected by any method of graduation.
Assume it to disappear altogether, and consider the standard
deviation for say $a_{50}$ ($O^{M(5)}$ 3 per-cent) obtained on this
hypothesis. There were approximately 100,000 deaths
observed above age 50 in this experience. We have
$\sqrt{100{,}000} = 316$ nearly, and if we assume the mortality
above 50 to be throughout subject to an error of $\pm\frac{1}{316}$ of
the observed amount, this will be equivalent to changes
of $\pm\frac{A}{316}$ and $\pm\frac{B}{316}$ in the values of the constants A and B
respectively, which, taking the value of A = .00589 and
log c = .039, are equivalent in their effect upon the
annuity-value to a change of .00186 in the rate of interest per-cent
and of .0341 years in the age. The combined effect of these
changes upon the annuity-value at age 50 is equivalent to
±.0148 as compared with the standard deviation of .0185
obtained above. The very considerable standard deviations
at the ends of the table would, however, be reduced in much
greater proportion.

The problem dealt with above is not the same as that of
determining the standard deviation in the estimated value of
an annuity on a single life. This problem, which is also of
importance, has been dealt with by Dr. Bremiker in his paper
"On the Risk Attaching to the grant of Life Assurances"
(J.I.A. xvi, pp. 216, 285). As this paper is not very available
for students and the notation is not modern, it may be worth
while to give the following short demonstration. For the
sake of simplicity "continuous" functions are used.

If the annuitant, aged x at entry, die at the end of the
time t the loss to the company granting the annuity, or the
deviation from its mean value, referred to the date of entry
will be

$$\frac{\bar A_x - e^{-\delta t}}{\delta}$$

and the sum of the squares of all values of this quantity,
multiplied by the frequency in each case, will be

$$-\int_0^\infty\left(\frac{\bar A_x - e^{-\delta t}}{\delta}\right)^{2}\frac{d}{dt}({}_tp_x)\,dt = -\frac{1}{\delta^{2}}\int_0^\infty\left(\bar A_x^{2} - 2\bar A_x e^{-\delta t} + e^{-2\delta t}\right)\frac{d}{dt}({}_tp_x)\,dt.$$

Noting that

$$-\int_0^\infty \frac{d}{dt}({}_tp_x)\,dt = 1, \qquad -\int_0^\infty e^{-\delta t}\,\frac{d}{dt}({}_tp_x)\,dt = \bar A_x,$$

and

$$-\int_0^\infty e^{-2\delta t}\,\frac{d}{dt}({}_tp_x)\,dt = \bar A'_x \quad (\text{at rate of interest} = e^{2\delta}-1)$$

we obtain from the above, as the value of the standard
deviation of $\bar a_x$, and therefore with sufficient accuracy for
practical purposes of $a_x$ ($= \bar a_x - \tfrac12$ nearly) the expression

$$\sigma = \frac{1}{\delta}\left[\bar A'_x - (\bar A_x)^{2}\right]^{\frac12}$$

the first term in the bracket being computed at the rate of
interest $e^{2\delta}-1$, and the second at the rate $e^{\delta}-1$. It is obvious
that the standard deviation for $\bar A_x$ will be the above expression
multiplied by $\delta$; and for $\bar A_x$ less the capitalised value of the
annual premiums ($\bar P_x$) (which Dr. Bremiker terms the "Risk
attaching to the grant of Life Assurances" by annual
premiums) the risk will be the above expression multiplied by
$(\bar P_x + \delta)$. The premium is here supposed to be payable
continuously; if an ordinary annual premium is in question,
we should multiply the above expression for $\sigma$ by $(P_x + d)$.
The arithmetical values of these "risks" attaching to grant
of assurances or annuities computed at 4 per-cent, according
to Heym's mortality table (General Widows Fund of Berlin)
are given in the paper referred to, and show, as is obviously
the case from general considerations, that the "risk", or
average fluctuation whether profit or loss, attaching to the
grant of assurances at annual premiums is considerably
greater than that attaching to their grant at single premiums.
In practice the important question for a life office, in this
connection (and the same considerations apply to other
classes of insurance) is the average amount of the annual (or
quinquennial) fluctuation in profit due to the deviation of the
death strain from its average or normal amount. In a
soundly managed office these fluctuations never approach the
point at which stability is remotely threatened, but they
become of importance when they are sufficient to produce any
serious variation in the rate of Bonus.

The mean square deviation of $\bar e_x$ will be found by putting
$\delta = 0$ in the expression for $\sigma^{2}$, which in that case takes the
indeterminate form $\tfrac00$ which must be evaluated, according to
the rules of the Differential Calculus, by differentiating
numerator and denominator. The resulting expression takes
the same form, so that the process must be repeated, and the
limiting value of the expression for $\sigma^{2}$ when $\delta = 0$ will be
found to be

$$\frac12\left[\frac{d^{2}}{d\delta^{2}}\bar A'_x - \frac{d^{2}}{d\delta^{2}}(\bar A_x)^{2}\right]_{\delta=0}$$

which may easily be reduced to the form

$$= -\int_0^\infty t^{2}\,d({}_tp_x) - (\bar e_x)^{2}$$

$$= [\text{mean square duration} - (\text{mean duration})^{2}]$$

This being the mean square deviation the standard deviation
will be

$$\sigma = [\text{mean square duration} - (\text{mean duration})^{2}]^{\frac12}$$

the mean deviation irrespective of sign is approximately
$.798\sigma$ and the probable deviation $.674\sigma$, or very nearly
$\tfrac45\sigma$ and $\tfrac23\sigma$ respectively.* [Cf. De Morgan, Encycl.
Metropolitan. Vol. II, p. 460, Art. 149]. If instead of a single risk
the average of n risks be taken, all the above quantities will
be divided by $\sqrt{n}$.

* The exact values for the mean deviation irrespective of sign of the
expectation of life and of the annuity will clearly be ${}_{t|}\bar e_x$ and ${}_{t|}\bar a_x$ respectively,
where t is in the first instance equal to $\bar e_x$ and in the second to the term of the
continuous annuity certain $\bar a_{\overline{t|}} = \bar a_x$.
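These single-life results may be illustrated numerically. The sketch below is an assumption-laden illustration: the distribution of durations is invented, the function names are hypothetical, the annuity standard deviation is taken as the square root of the excess of the assurance value at the doubled force of interest over the square of the ordinary assurance value, divided by $\delta$ (which is how the integrals of the demonstration combine), and the factors .798 and .674 are those belonging to the normal law of error.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def annuity_sd(durations, probabilities, delta):
    """Standard deviation of a continuous annuity on a single life."""
    a_bar = sum(p * exp(-delta * t) for t, p in zip(durations, probabilities))
    a_bar_doubled = sum(p * exp(-2.0 * delta * t)
                        for t, p in zip(durations, probabilities))
    return sqrt(a_bar_doubled - a_bar * a_bar) / delta


def duration_sd(durations, probabilities):
    """Square root of [mean square duration - (mean duration)^2]."""
    mean = sum(p * t for t, p in zip(durations, probabilities))
    mean_square = sum(p * t * t for t, p in zip(durations, probabilities))
    return sqrt(mean_square - mean * mean)


MEAN_DEVIATION_FACTOR = sqrt(2.0 / pi)                   # about .798
PROBABLE_DEVIATION_FACTOR = NormalDist().inv_cdf(0.75)   # about .674

# Invented distribution of the time to death, for illustration only.
durations = [10.0, 20.0, 30.0]
probabilities = [0.25, 0.50, 0.25]

sigma_life = duration_sd(durations, probabilities)
sigma_small_delta = annuity_sd(durations, probabilities, 1e-5)
sigma_for_100_risks = sigma_life / sqrt(100)
```

As the force of interest is made very small the annuity figure approaches that for the duration of life, which is the limiting process described above.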


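NOTE A.


On the Evaluation of the Successive Moments of the
Binomial Expansion of $(p+q)^{n}$.

These important moments may be found very simply in the
following manner. The expanded series being

$$p^{n} + np^{n-1}q + \frac{n(n-1)}{2}p^{n-2}q^{2} + \dots + npq^{n-1} + q^{n} = \Sigma u_x,$$

where the subscript is identical with the exponent of q,
the successive moments round the origin will be $\Sigma u_x$, $\Sigma xu_x$, $\Sigma x^{2}u_x$,
&c. We will first find the value of $\Sigma u_x$, $\Sigma xu_x$, $\Sigma x(x-1)u_x$, &c.
We have

$$\Sigma u_x = (p+q)^{n} = 1^{n} = 1$$

$$\begin{aligned}
\Sigma xu_x &= 0\times p^{n} + 1\times np^{n-1}q + 2\times\frac{n(n-1)}{2}p^{n-2}q^{2} + \dots + nq^{n}\\
&= nq\left[p^{n-1} + (n-1)p^{n-2}q + \dots + q^{n-1}\right]\\
&= nq(p+q)^{n-1} = nq.
\end{aligned}$$

Similarly

$$\begin{aligned}
\Sigma x(x-1)u_x &= 1\times2\times\frac{n(n-1)}{2}p^{n-2}q^{2} + 2\times3\times\frac{n(n-1)(n-2)}{6}p^{n-3}q^{3} + \dots + n(n-1)q^{n}\\
&= n(n-1)q^{2}\left[p^{n-2} + (n-2)p^{n-3}q + \dots + q^{n-2}\right]\\
&= n(n-1)q^{2}(p+q)^{n-2} = n(n-1)q^{2}
\end{aligned}$$

and similarly we shall find

$$\Sigma x(x-1)(x-2)u_x = n(n-1)(n-2)q^{3},\ \text{and so on.}$$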
Hence we shall have

$$\begin{aligned}
\Sigma u_x &= 1\\
\Sigma^{2}u_x &= \Sigma xu_x = nq\\
\Sigma^{3}u_x &= \Sigma\frac{x(x-1)}{2}u_x = \frac{n(n-1)}{2}q^{2}\\
\Sigma^{4}u_x &= \Sigma\frac{x(x-1)(x-2)}{6}u_x = \frac{n(n-1)(n-2)}{6}q^{3}\\
\Sigma^{5}u_x &= \Sigma\frac{x(x-1)(x-2)(x-3)}{24}u_x = \frac{n(n-1)(n-2)(n-3)}{24}q^{4}
\end{aligned}$$

by the formula on page 59. Hence we have (see the demonstration
in Note E, page 124), using $m_n$ for the nth moment round the origin

$$\begin{aligned}
m_0 &= \Sigma u_x = 1\\
m_1 &= \Sigma^{2}u_x = nq\\
m_2 &= 2\Sigma^{3}u_x + \Sigma^{2}u_x = n(n-1)q^{2} + nq\\
m_3 &= 6\Sigma^{4}u_x + 6\Sigma^{3}u_x + \Sigma^{2}u_x = n(n-1)(n-2)q^{3} + 3n(n-1)q^{2} + nq\\
m_4 &= 24\Sigma^{5}u_x + 36\Sigma^{4}u_x + 14\Sigma^{3}u_x + \Sigma^{2}u_x\\
&= n(n-1)(n-2)(n-3)q^{4} + 6n(n-1)(n-2)q^{3} + 7n(n-1)q^{2} + nq
\end{aligned}$$

These last equations may be found directly, by means of successive differentiation, according to a method suggested by Bertrand (Calcul des Probabilités, Chap. IV, Art. 62). We have

$$(p+q)^n = p^n + np^{n-1}q + \frac{n(n-1)}{2}p^{n-2}q^2 + \dots + npq^{n-1} + q^n$$

$$\frac{d}{dq}(p+q)^n = \left[0 \times p^n + 1 \times np^{n-1} + 2 \times \frac{n(n-1)}{2}p^{n-2}q + \dots + (n-1)npq^{n-2} + nq^{n-1}\right]$$

and

$$q\,\frac{d}{dq}(p+q)^n = \left[1 \times np^{n-1}q + 2 \times \frac{n(n-1)}{2}p^{n-2}q^2 + \dots + nq^n\right] = \text{1st moment}.$$

Similarly, if we differentiate the last series with respect to $q$, and multiply the result by $q$ (to restore the power of $q$ which is lost in the differentiation) we shall have

$$1^2 \times np^{n-1}q + 2^2 \times \frac{n(n-1)}{2}p^{n-2}q^2 + \dots + n^2 q^n = \text{2nd moment};$$

and so on, so that

$$[t\text{th moment}] = q\,\frac{d}{dq}\,[(t-1)\text{th moment}].$$

Thus, the first moment

$$= q\,\frac{d}{dq}(p+q)^n = nq(p+q)^{n-1}$$

Second moment

$$= q\,\frac{d}{dq}\left[nq(p+q)^{n-1}\right] = n(n-1)q^2(p+q)^{n-2} + nq(p+q)^{n-1}$$

Third moment

$$= q\,\frac{d}{dq}\left[n(n-1)q^2(p+q)^{n-2} + nq(p+q)^{n-1}\right]$$

$$= n(n-1)(n-2)q^3(p+q)^{n-3} + 2n(n-1)q^2(p+q)^{n-2} + n(n-1)q^2(p+q)^{n-2} + nq(p+q)^{n-1}$$

$$= n(n-1)(n-2)q^3(p+q)^{n-3} + 3n(n-1)q^2(p+q)^{n-2} + nq(p+q)^{n-1}$$

Fourth moment

$$= q\,\frac{d}{dq}\,[\text{third moment}]$$

$$= n(n-1)(n-2)(n-3)q^4(p+q)^{n-4} + 3n(n-1)(n-2)q^3(p+q)^{n-3} + 3n(n-1)(n-2)q^3(p+q)^{n-3} + 6n(n-1)q^2(p+q)^{n-2} + n(n-1)q^2(p+q)^{n-2} + nq(p+q)^{n-1}$$

$$= n(n-1)(n-2)(n-3)q^4(p+q)^{n-4} + 6n(n-1)(n-2)q^3(p+q)^{n-3} + 7n(n-1)q^2(p+q)^{n-2} + nq(p+q)^{n-1}$$

Putting unity* for all the powers of $(p+q)$, these expressions are the same as previously found; see equations A.

* This may not be done at any earlier stage because the differentiations are with respect to $q$, taking $p$ constant, whereas to substitute $p+q=1$ before finishing the differentiations would make $p$ vary with $q$.

We have thus obtained the moments round the origin. Thence the moments round the mean may be found by the formulae on p. 41. Thus

$$\mu_2 = m_2 - (m_1)^2 = n(n-1)q^2 + nq - n^2q^2 = nq - nq^2 = nq(1-q) = npq$$

$$\mu_3 = m_3 - 3m_1\mu_2 - m_1^3$$

$$= n(n-1)(n-2)q^3 + 3n(n-1)q^2 + nq - 3n^2q^2 + 3n^2q^3 - n^3q^3$$

$$= nq - 3nq^2 + 2nq^3 = nq(1 - 3q + 2q^2) = nq(1-q)(1-2q) = npq(p-q)$$

$$\mu_4 = m_4 - 4m_1\mu_3 - 6m_1^2\mu_2 - m_1^4$$

$$= nq\left[(n^3 - 6n^2 + 11n - 6)q^3 + 6(n^2 - 3n + 2)q^2 + 7(n-1)q + 1 - 4nq(1 - 3q + 2q^2) - 6n^2q^2(1-q) - n^3q^3\right]$$

$$= nq\left[3(n-2)q^3 - 6(n-2)q^2 + (3n-7)q + 1\right]$$

which reduces to

$$nq(1-q)\left[3(n-2)(1-q)q + 1\right] = npq\left[3(n-2)pq + 1\right]$$

It is evident that all the even moments must involve $p$ and $q$ symmetrically; while the odd moments will involve a symmetrical function of $p$ and $q$, together with the factor $(p - q)$, because they must vanish when $p = q$ (i.e., when the curve is symmetrical) and must only change sign when $p$ and $q$ are transposed.
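The closed forms $\mu_2 = npq$, $\mu_3 = npq(p-q)$ and $\mu_4 = npq[3(n-2)pq+1]$ can be checked numerically by summing the terms of the expansion directly. The following is only an illustrative sketch: the function name `binomial_moments` and the trial values `n = 12`, `q = 0.3` are assumptions chosen for the check and are not taken from the text.

```python
from math import comb

def binomial_moments(n, q):
    """Moments of the number of events in n trials, each of probability q.

    Returns (raw, central): raw[t] is the t-th moment round the origin for
    t = 0..4, and central[t] the t-th moment round the mean for t = 2, 3, 4.
    """
    p = 1.0 - q
    # u[x] is the term of (p + q)^n in which q carries the exponent x
    u = [comb(n, x) * p ** (n - x) * q ** x for x in range(n + 1)]
    raw = [sum(x ** t * ux for x, ux in enumerate(u)) for t in range(5)]
    mean = raw[1]
    central = {t: sum((x - mean) ** t * ux for x, ux in enumerate(u))
               for t in (2, 3, 4)}
    return raw, central

n, q = 12, 0.3          # illustrative values only
p = 1.0 - q
raw, central = binomial_moments(n, q)

# Closed forms from the note
closed_raw2 = n * (n - 1) * q ** 2 + n * q
closed_central = {
    2: n * p * q,
    3: n * p * q * (p - q),
    4: n * p * q * (3 * (n - 2) * p * q + 1),
}
for t in (2, 3, 4):
    print(t, central[t], closed_central[t])
```

The direct summation costs only $n+1$ terms, so it is a convenient guard against slips in the algebra for any particular $n$ and $q$.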


It may be convenient to repeat here the Author's demonstration, given in J.I.A., xxvii, 214, of the value of the average deviation from the mean irrespective of sign, that is, treating all the deviations as positive.

If we suppose the event to happen $m$ times in the $n$ trials, the deviation from the mean number $np$ will be $(m - np)$, which, since $p + q$ is always equal to 1, may be put in the form $[mq - (n-m)p]$. This will be positive or negative as $m$ is greater or less than $np$; and the probability of this particular deviation will be

$$\frac{n(n-1)\cdots(m+1)}{(n-m)!}\,p^m q^{n-m}.$$

The greatest positive deviation will be $nq$ (when the event happens at all the $n$ trials); the greatest negative deviation $-np$ (when it fails at every trial).

Hence, we have the following scheme, in which $m$ is to be taken as the next integer below $np$.

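Possible Deviations from Mean Result $np$.

$$\begin{array}{l|l|l|l}
\text{Sign} & \text{Magnitude} & \text{Probability} & \text{Magnitude} \times \text{Probability} \\ \hline
+ & nq & p^n & np^nq \\
+ & (n-1)q - p & np^{n-1}q & n(n-1)p^{n-1}q^2 - np^nq \\
+ & (n-2)q - 2p & \frac{n(n-1)}{2}p^{n-2}q^2 & \frac{n(n-1)(n-2)}{2}p^{n-2}q^3 - n(n-1)p^{n-1}q^2 \\
+ & \cdots & \cdots & \cdots \\
+ & (m+1)q - (n-m-1)p & \frac{n\cdots(m+2)}{(n-m-1)!}p^{m+1}q^{n-m-1} & \frac{n\cdots(m+1)}{(n-m-1)!}p^{m+1}q^{n-m} - \frac{n\cdots(m+2)}{(n-m-2)!}p^{m+2}q^{n-m-1} \\ \hline
- & mq - (n-m)p & \frac{n\cdots(m+1)}{(n-m)!}p^{m}q^{n-m} & \frac{n\cdots m}{(n-m)!}p^{m}q^{n-m+1} - \frac{n\cdots(m+1)}{(n-m-1)!}p^{m+1}q^{n-m} \\
- & \cdots & \cdots & \cdots \\
- & q - (n-1)p & npq^{n-1} & npq^n - n(n-1)p^2q^{n-1} \\
- & -np & q^n & -npq^n
\end{array}$$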

If the final column of products is examined it will be seen that each positive term is cancelled by a similar negative term in the succeeding product. Hence, the total of the products, that is to say, the average deviation, is zero, showing that $np$ is the true mean result, the positive and negative deviations from which exactly balance each other. Of the terms above the horizontal line, representing the positive deviations, the sum is, of course, equal to the only uncancelled term,

$$\frac{n\cdots(m+1)}{(n-m-1)!}\,p^{m+1}q^{n-m},$$

and similarly of the terms below the line, representing the negative deviations, the sum is

$$-\frac{n\cdots(m+1)}{(n-m-1)!}\,p^{m+1}q^{n-m}.$$

Hence, the average magnitude of the deviations, that is, the total of every possible deviation multiplied by its probability, regardless of sign, is

$$2\cdot\frac{n\cdots(m+1)}{(n-m-1)!}\,p^{m+1}q^{n-m} = \frac{2\,n!}{m!\,(n-m-1)!}\,p^{m+1}q^{n-m} \qquad (a)$$

which, since the sum of all the probabilities is necessarily 1, will also be the average or mean deviation. This result is exact, not approximate, but where $n$ and $m$ are large numbers it is necessary to simplify it by the use of Stirling's formula, which gives for large numbers $n! = \sqrt{2\pi}\,n^{n+\frac12}e^{-n}$ nearly.
Put (a) into the equivalent form

$$\frac{2\,n!\,(n-m)}{m!\,(n-m)!}\,p^{m+1}q^{n-m};$$

using Stirling's approximation to the factorials, we have

$$= \sqrt{\frac{2}{\pi}}\;n^{n+\frac12}\,m^{-(m+\frac12)}\,(n-m)^{-(n-m+\frac12)}\,(n-m)\,p^{m+1}q^{n-m}.$$

Since $m$ is the integer immediately below $np$, we may write $m = np - k$; $n - m = nq + k$ (where $k$ is a fraction); hence, we get

$$\sqrt{\frac{2npq}{\pi}}\left(1-\frac{k}{np}\right)^{-np}\left(1+\frac{k}{nq}\right)^{-nq}\left[\left(1-\frac{k}{np}\right)^{k-\frac12}\left(1+\frac{k}{nq}\right)^{\frac12-k}\right]$$

but where $np$ and $nq$ are large numbers, $k$ being a proper fraction, the last factor is very nearly equal to 1, and $\left(1-\frac{k}{np}\right)^{-np}$ and $\left(1+\frac{k}{nq}\right)^{-nq}$ are very nearly equal to $e^{k}$ and $e^{-k}$ respectively; hence, the above expression reduces to

$$\sqrt{\frac{2npq}{\pi}} = .79788\sqrt{npq} = \tfrac{4}{5}\sqrt{npq} \text{ very nearly.}$$

Although this result has been obtained on the assumption that $np$ and $nq$ are large, it will be found to be a very close approximation even for small numbers. As an extreme case, suppose 120 lives at risk, the probability of death in each case being .02; the "expected" deaths would then be 2.4, and the extent by which the actual deaths would, on the average, exceed or fall short of this number would be given by the formula as

$$\tfrac{4}{5}\sqrt{2.4 \times .98} = 1.227.$$

The true value of the average deviation given by formula (a) is

$$2\cdot\frac{120!}{2!\,117!}\,(.02)^3(.98)^{118} = 1.243,$$

almost identical with the approximate result above.
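The example of 120 lives can be reproduced by evaluating formula (a) directly and comparing it with a brute-force sum over every possible number of deaths. This is a minimal sketch; the function names are hypothetical, and the factor $\tfrac45$ is the rounded value of $\sqrt{2/\pi}$ used in the text.

```python
from math import comb, floor, sqrt

def mean_deviation_formula(n, p):
    """Formula (a): 2 * n!/(m!(n-m-1)!) * p^(m+1) * q^(n-m), m = integer below np."""
    q = 1.0 - p
    m = floor(n * p)
    # (m+1) * C(n, m+1) equals n! / (m! (n-m-1)!)
    return 2 * (m + 1) * comb(n, m + 1) * p ** (m + 1) * q ** (n - m)

def mean_deviation_brute(n, p):
    """Sum of |x - np| times its probability over x = 0..n."""
    q = 1.0 - p
    return sum(abs(x - n * p) * comb(n, x) * p ** x * q ** (n - x)
               for x in range(n + 1))

def mean_deviation_approx(n, p):
    """The approximation (4/5) * sqrt(npq)."""
    return 0.8 * sqrt(n * p * (1.0 - p))

n, p = 120, 0.02
exact = mean_deviation_formula(n, p)
brute = mean_deviation_brute(n, p)
approx = mean_deviation_approx(n, p)
print(round(exact, 3), round(approx, 3))
```

Formula (a) needs a single term where the brute-force sum needs $n+1$, which is the practical point of the cancellation argument above.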
NOTE B.


On the Use of Logarithms of the Unadjusted Terms
of a Series.

Consider the number of cases out of a given series falling into a particular group; or the number of deaths, or analogous events, at a given age or group of ages, accruing out of a given number at risk. Suppose the series to consist of $n$ cases in all, and let the true probability of any case falling into the particular group be $p$, and let $m = np$. Let the observed number of cases in the group be $m' = m + z$, where, as we have seen, $z$ has an average value of zero, $z^2$ has an average value of

$$np(1-p) = \frac{m(n-m)}{n},$$

$z^3$ has an average value of

$$np(1-p)(1-2p) = \frac{m(n-m)(n-2m)}{n^2}, \text{ &c.}$$

If we operate with the logs of the observed quantities $m'$, we must avoid by arbitrary grouping cases in which $m'$ is zero, or $m'$ is very small, when the logs become infinite or very great; but when this is done we shall still find the logs of the ungraduated numbers less on the average than the logs of the graduated (or true) numbers. This may be easily seen from a simple example. Let $n = 4$, and $p = \tfrac12$, in which case $m = np = 2$. The observed values of $m'$ may be anything from 0 to 4, and we shall have the following possible cases:


    (1)            (2)                   (3)         (4)
    Values of      Relative frequency    log m'      Products
    m' = m + z     of these values                   (2) x (3)

    0              1/16  }               -.097       -.030
    1              4/16  }  (say)
    2              6/16                   .301        .113
    3              4/16                   .477        .119
    4              1/16                   .602        .038

    Total          1                                  .240

Here, to avoid the cases in which the observed value of $m'$ is zero, we have combined the first two groups, taking four cases in which $m' = 1$, for one case in which $m' = 0$, thus giving an average value of $m' = \tfrac45$, the logarithm of which is $-.097$. Notwithstanding this device our average value of $\log m'$ is only .240 as compared with the value of $\log m = .301$ (where $m = 2$ is the true value or average value of $m'$).

Assume that on the average

$$\log[m'(1+k)] = \log m,$$

so that

$$\log m = \log[(m+z)(1+k)] = \log m + \frac{z}{m} - \frac{z^2}{2m^2} + \frac{z^3}{3m^3}, \text{ &c.}, + k - \frac{k^2}{2} + \text{&c.}$$

Whence

$$k - \frac{k^2}{2} + \text{&c.} = -\frac{z}{m} + \frac{z^2}{2m^2} - \frac{z^3}{3m^3}, \text{ &c.}$$

Insert the average values as given above for $z$, $z^2$, &c.,

$$k - \frac{k^2}{2} + \text{&c.} = \frac{n-m}{2nm} - \frac{(n-m)(n-2m)}{3n^2m^2} + \text{&c.},$$

or, omitting terms of the second order,

$$k = \frac{n-m}{2nm} \text{ nearly,}$$

which, again omitting terms of the second order, may be written

$$k = \frac{n-m'}{2nm'}$$

$$\log[m'(1+k)] = \log\left[m' + \frac{n-m'}{2n}\right] = \log\left[m' + \frac{1-p'}{2}\right]$$

where $p'$ is the observed value of the probability $p$.

If this expression be substituted for $\log m'$ in the example given above, we should have as the sum of the products of col. (2) × col. (3) the value .309, which is very much nearer the true value .301 than the uncorrected value in the above table. If we take larger numbers, as $n = 100$, $m = np = 10$, we shall find by a similar process the average value of $\log_{10}\left(m' + \frac{1-p'}{2}\right)$ is .99987 as compared with the true value of $\log_{10} m = 1.00000$. Where the numbers $n$ and $m$ are very large, the correction, of course, becomes insignificant.
It may be shown in like manner that, if we are dealing with the reciprocals of the observed values, then, on the average,

$$\frac{1}{m' + 1 - p'} = \frac{1}{m} \text{ nearly,}$$

and again, on the average,

$$\frac{n}{m' + 1 - p'} = \frac{1}{p} \text{ nearly.}$$

Reverting to the question of the use of the logs of the ungraduated quantities, it will be found that if the above results are made use of in practice, the logarithms will be over-corrected. The reason for this is that we do not eventually arrive at the true values of $\log m$ and $\log p$, the graduated values being still affected by an outstanding or unbalanced error. If our series consists of a large number of groups, these outstanding errors will be comparatively small, and the above correction will not be much in excess; but if the number of groups is very small, our graduated quantities must necessarily follow rather closely the original values, and the use of the above formula would largely over-correct the series. Suppose, for example, we had a series of ten groups. We should require about five groups to obtain the general form of the curve, or to determine the constants of any frequency curve employed; hence the errors of the groups would only be reduced by the ratio of approximately $\frac{1}{\sqrt2}$, and the correction $k$ as shown above should be reduced by half, and proportionately in other cases.
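The effect of replacing $\log m'$ by $\log\left[m' + \frac{1-p'}{2}\right]$ in the two examples of this note ($n = 4$, $p = \tfrac12$ and $n = 100$, $p = .1$) can be checked by averaging over every possible observed value. The sketch below is illustrative only; the function name `mean_corrected_log` is an assumption, and common logarithms are used as in the table above.

```python
from math import comb, log10

def mean_corrected_log(n, p):
    """Average of log10(m' + (1 - p')/2) over all observed values m' = 0..n,
    each weighted by its binomial probability; p' = m'/n is the observed rate."""
    total = 0.0
    for observed in range(n + 1):
        prob = comb(n, observed) * p ** observed * (1.0 - p) ** (n - observed)
        p_observed = observed / n
        total += prob * log10(observed + (1.0 - p_observed) / 2.0)
    return total

small = mean_corrected_log(4, 0.5)     # to be compared with log10(2) = .301
large = mean_corrected_log(100, 0.1)   # to be compared with log10(10) = 1
print(round(small, 3), large)
```

Because the corrected argument is never zero, no arbitrary grouping of the case $m' = 0$ is needed in this calculation.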
NOTE C.


On the Rationale of the Method of Least Squares.

In statistical work it often happens that a number of constants, entering into the known mathematical form of a given function, have to be evaluated from a much greater number of observed values of the function. We may, for example, have three constants, such as $x$, $y$, $z$, in the expression $lx + my + nz = F$, and fifty observed values of $F$ (embodying different values of the coefficients $l$, $m$, $n$) from which to determine the constants. If the observed values of $F$ were rigidly accurate, any three of them, or any three combinations, would suffice to determine the constants, and it would be immaterial what set of three was selected, since all would lead to the same results. But generally the observed values of $F$ will be affected by errors of observation and hence will not be strictly consistent; and taking the above example each of the $\frac{50 \times 49 \times 48}{1 \times 2 \times 3} = 19{,}600$ different sets of three individual equations would in general produce different values of the constants: so that, apart from the prohibitive amount of labour required in the solution of so many equations, we should have no means of deciding which was the best or most advantageous solution, or how to combine the solutions in order to obtain the best average results. The method of least squares supplies the means of combining the original observations in such a manner as to produce a number of equations, equal to the number of unknowns (in the above example, three), the solution of which by the usual process leads to the most probable values of the unknown constants.

Suppose that the observed function $F$ is a linear function of the variables $x$, $y$, $z$, of the form $lx + my + nz \dots$, and that the errors in the observed values of $F$ follow the "normal law", so that the probability of an error $k$ is proportional to $e^{-k^2/c^2}$, where the standard deviation of $F$ is $\frac{c}{\sqrt2}$. We shall further suppose that the equations have been so weighted that the value of $c$ is the same in each of the observations, or that the "precision" is uniform. Thus, for example, if in a given equation the probability of an error of $k$ in the observed value of $F$ is proportionate to $e^{-k^2/c'^2}$, with a standard deviation of $\frac{c'}{\sqrt2}$, then multiplying the equation by $\frac{c}{c'}$, we shall have an equation with a standard deviation of $\frac{c}{\sqrt2}$ as before, and the probability of an error $k$ will be proportional to $e^{-k^2/c^2}$ as required.

Let there be $t$ equations as follows (where $t$ is supposed greater than the number of unknowns, say $s$):

$$\left.\begin{aligned}
l_1x + m_1y + n_1z + \dots - w_1F &= k_1 \\
l_2x + m_2y + n_2z + \dots - w_2F &= k_2 \\
\dots \qquad & \\
l_tx + m_ty + n_tz + \dots - w_tF &= k_t
\end{aligned}\right\} \qquad \text{(A)}$$

where $F$ represents the true value of the observed function and $k_1$, $k_2$, ... the errors of observation. The chance of the errors being, by hypothesis, respectively proportional to $e^{-k_1^2/c^2}$, $e^{-k_2^2/c^2}$, ..., the chance of the conjunction of these individual errors will be proportional to $e^{-(k_1^2/c^2 + k_2^2/c^2 + \dots)}$, which will obviously have its greatest value when the quantity in brackets is a minimum. Now, the most probable values of the constants will be those that give the greatest probability of the observed event, i.e., the happening of the given combination of errors. Thus, the most probable values will be those making $[k_1^2/c^2 + k_2^2/c^2 + \dots]$ or $\Sigma k_t^2/c^2$ or $\Sigma k_t^2$ a minimum; hence the name "method of least squares."
Now we have

$$\Sigma k_t^2 = \Sigma\left[(l_tx + m_ty + n_tz + \dots - w_tF)^2\right]$$

and since $x$, $y$, $z$ ... are supposed to be independent, the minimum value must correspond to such values of $x$, $y$, $z$ ... as will make the partial differential coefficients of this expression, with respect to $x$, $y$, $z$ ..., all vanish.* Hence we must have, omitting a common factor 2,

$$\left.\begin{aligned}
\Sigma\left[l_t(l_tx + m_ty + n_tz + \dots - w_tF)\right] &= 0 \\
\Sigma\left[m_t(l_tx + m_ty + n_tz + \dots - w_tF)\right] &= 0 \\
\Sigma\left[n_t(l_tx + m_ty + n_tz + \dots - w_tF)\right] &= 0 \\
\text{&c., &c., &c.} \qquad &
\end{aligned}\right\} \qquad \text{(B)}$$

(the summation extending to all values of $t$) as the system of equations, $s$ in number, for determining the most probable values of $x$, $y$, $z$. Hence the rule:

* These conditions, though necessary, are not in general sufficient to ensure a minimum, but in this case it is obvious that a minimum exists because high negative values and high positive values of $x$, $y$, $z$ ... alike give large values to the function.

"First prepare the equations by multiplying each by its proper weight (the reciprocal of the probable error or standard deviation), thus giving a set of equations with a uniform p.e. and s.d. Multiply each equation by the coefficient of $x$ and add all the results together; next multiply each by the coefficient of $y$ and add all the results together, and so on: the resulting aggregate equations, solved in the usual manner, give the most probable values of the constants."
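The rule just quoted can be sketched in a few lines: weight each equation by the reciprocal of its standard deviation, form the aggregate ("normal") equations by multiplying through by each coefficient in turn and adding, and solve them. Everything named here (`solve_linear`, `least_squares`, the sample coefficients and observations) is hypothetical and chosen only for illustration; the observations are made exactly consistent with $x=2$, $y=3$, $z=-1$ so that the result is easy to check.

```python
def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def least_squares(rows, observed, sd):
    """rows[t] holds the coefficients (l_t, m_t, n_t, ...), observed[t] the
    observed F, sd[t] its standard deviation."""
    s = len(rows[0])
    # Step 1: multiply each equation by its weight 1/sd
    wrows = [[c / e for c in row] for row, e in zip(rows, sd)]
    wobs = [f / e for f, e in zip(observed, sd)]
    # Step 2: multiply by each coefficient in turn and add (equations B)
    normal = [[sum(r[i] * r[j] for r in wrows) for j in range(s)] for i in range(s)]
    rhs = [sum(r[i] * f for r, f in zip(wrows, wobs)) for i in range(s)]
    # Step 3: solve the s aggregate equations
    return solve_linear(normal, rhs)

rows = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1], [1, 2, 3]]
observed = [2.0, 3.0, -1.0, 4.0, 5.0]      # exactly 2l + 3m - n
sd = [1.0, 2.0, 1.0, 0.5, 1.0]
solution = least_squares(rows, observed, sd)

# With a single constant the method returns the weighted average
single = least_squares([[1], [1], [1]], [10.0, 12.0, 14.0], [1.0, 1.0, 2.0])
print(solution, single)
```

In the single-constant case the weights that emerge are the reciprocals of the squared standard deviations, so the three observations above carry weights 1, 1 and 1/4.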

It will be seen at once that if there is only one constant to be determined, the method based on the normal law of error gives the weighted average, i.e. the total of the weighted values divided by the total weights, as the most probable. Conversely, it may be shown that if the weighted average is the most probable value, then the facility of error must follow the normal law. Apart, however, from any hypothesis as to the law of error, it may be shown mathematically that the method of least squares gives results which become more and more nearly accurate as the number of observations increases. Considerations of a more general kind will also lead to the conclusion that the method must produce very good results. Without giving any definite form to the law of error, it is obvious that large errors are less probable than small, and that the most advantageous system of values for the unknown constants will be that which produces, on the whole, the smallest numerical deviations (irrespective of sign) between the adjusted and observed values of the function. Now, if the law of error is supposed unknown, we cannot investigate mathematically the conditions required to produce a minimum deviation irrespective of sign; and the simplest function of the errors which is independent of sign is the square of the errors, which will be the same for a positive or negative deviation, and at the same time attributes a rapidly increasing importance, or disadvantage, to the errors as they increase in magnitude. Hence we can see, in a very general way, that a method which gives a minimum value to the sum of the squares of the errors, is likely to lead to satisfactory results consistent with elementary notions as to the nature of the errors. Moreover, in actuarial work we usually have to do with numbers sufficiently large to make the normal law of error very near the truth.

Reverting to the system of equations (B), it will easily be seen that if $F$ is a parabolic function of the form $x + ay + a^2z + \dots$ the equations for determining $x$, $y$, $z$, ... &c., are equivalent to reproducing $\Sigma F$, $\Sigma aF$, $\Sigma a^2F$ ($\Sigma wF$, $\Sigma wF\cdot a$, &c., if the equations are weighted), i.e., the successive moments of the observations.

It has, so far, been supposed that the function $F$ is a linear function of the constants $x$, $y$, $z$, ... If this is not the case, suppose that the equally weighted equations, from which the values of $x$, $y$, $z$, ... are to be found, are of the form

$$\left.\begin{aligned}
f_1(x, y, z \dots) - w_1F &= k_1 \\
f_2(x, y, z \dots) - w_2F &= k_2 \\
\text{&c., &c., &c.} \qquad &
\end{aligned}\right\} \qquad \text{(C)}$$

where $f_1$, $f_2$ ... are known functions of the variables $x$, $y$, $z$ ... By means of $s$ of these equations, or of $s$ combinations from amongst them, or otherwise, find approximate values of $x$, $y$, $z$ ... say $x'$, $y'$, $z'$ ...; and suppose that $x = x' + \delta x$, $y = y' + \delta y$, $z = z' + \delta z$, &c., where it may be supposed that $\delta x$, $\delta y$, $\delta z$ ..., representing small corrections to be found, are so small that their squares may be neglected. Then if

$$f_1' = f_1(x', y', z' \dots); \quad f_2' = f_2(x', y', z' \dots), \text{ &c.,}$$

and so on, equation (C) will become

$$\left.\begin{aligned}
f_1' + \frac{df_1'}{dx'}\,\delta x + \frac{df_1'}{dy'}\,\delta y + \dots - w_1F &= k_1 \\
\text{&c., &c., &c.} \qquad &
\end{aligned}\right\} \qquad \text{(D)}$$

These equations are linear functions of the small corrections $\delta x$, $\delta y$, $\delta z$ ... which can accordingly be found by the rules already derived; and hence are found the corrected values $x = x' + \delta x$, $y = y' + \delta y$, &c. The process can be repeated, if greater accuracy is desired, until the corrective terms become insignificant.
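The repeated correction described above can be sketched for a function that is not linear in its constants. The example below fits $F = Bc^x$ with equal weights; the form of the function, the data, the starting values and the name `correct_repeatedly` are all assumptions made for illustration and are not the Makeham graduation treated in the text. At each round the equations are linearised about the current approximation, the two aggregate equations for $\delta B$ and $\delta c$ are formed and solved, and the corrections are applied.

```python
def correct_repeatedly(xs, observed, b, c, rounds=20):
    """Iterate the linearised least-squares correction for F = b * c**x."""
    for _ in range(rounds):
        s_bb = s_bc = s_cc = r_b = r_c = 0.0
        for x, f in zip(xs, observed):
            d_b = c ** x                  # partial derivative with respect to b
            d_c = b * x * c ** (x - 1)    # partial derivative with respect to c
            resid = f - b * c ** x        # observed minus approximate value
            s_bb += d_b * d_b
            s_bc += d_b * d_c
            s_cc += d_c * d_c
            r_b += d_b * resid
            r_c += d_c * resid
        # Solve the two aggregate equations for the corrections
        det = s_bb * s_cc - s_bc * s_bc
        delta_b = (r_b * s_cc - r_c * s_bc) / det
        delta_c = (s_bb * r_c - s_bc * r_b) / det
        b += delta_b
        c += delta_c
    return b, c

xs = list(range(10))
observed = [0.5 * 1.1 ** x for x in xs]       # error-free illustrative data
b_fit, c_fit = correct_repeatedly(xs, observed, b=0.4, c=1.12)
print(b_fit, c_fit)
```

The starting values must be reasonably close, as the text assumes; with poor first approximations the neglected squares of the corrections are no longer small and the process may not settle.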

In  the  important  particular  case  of  a  graduation  by  Makeham's 
formula,  the  original  equations  are  of  the  form 


U'_^.^i.(A  +  B/^-^) 


da- 


W 


^■+:;E 


I  -i  -2. 


($w$ being the "weight"). Approximate values of the constants, say $A'$, $B'$, $c'$, being found, the resulting equations for determining $\delta A$, $\delta B$, $\delta c$ are as follow, $\mu'_x$ representing $A' + B'c'^x$:

(2^)8A  +  {^w.  c  "^  bsB  +  {^w^c  '"  L-  +  l^Bc 


(:Sw . B'x  +lc"   2 )SA  +  (2w . B'x  +  - c'-'')8B 


+  {22r.B'-(.r+l)%'--i|s, 


1    ,a;— 


For  an  example,  see  J.I. A.,  xvii,  161-71. 
NOTE D.


On the Use of the Binomial Curve to Represent
a Continuous Series.

If the Binomial curve $y = \frac{n!}{x!\,(n-x)!}\,p^xq^{n-x}$ made high contact with the axis of $x$ at the points $x = -1$ and $x = n+1$, where $y$ becomes zero, it could be conveniently employed to represent a continuous curve in lieu of representing merely isolated ordinates; as in that case the moments of the continuous curve would very closely agree with those of the isolated ordinates. The same would be true of any series of equidistant points on the curve supposing these to be fairly numerous. If, for example, we suppose the values of $y$ tabulated for every integral value of $xh$, then the $t$th moment of the curve would be increased by multiplication by the factor $h^t$, and from the observed numerical values of the first 4 moments $h$ and the remaining constants could be obtained. As, however, the curve $y$ cuts the axis of $x$ at an angle at both limits, this method of proceeding will lead to approximate results only when $n$ is fairly large.

The area of $y$ treated as a continuous curve may be approximately determined from the well known approximate formula

$$\int_{-1}^{n+1} y\,dx = \tfrac12 y_{-1} + y_0 + \dots + y_n + \tfrac12 y_{n+1} - \frac{1}{12}\left[\left(\frac{dy}{dx}\right)_{n+1} - \left(\frac{dy}{dx}\right)_{-1}\right]$$

$y_{-1}$ and $y_{n+1}$ being of course equal to zero, and the series $y_0 + y_1 + \dots + y_n$ is the expansion of $(p+q)^n$, where we assume $p + q = 1$, and is therefore also $= 1$. As the factor $\frac{1}{x!}$ vanishes for $x = -1$, and $\frac{1}{(n-x)!}$ vanishes for $x = n+1$, we have

$$\left(\frac{dy}{dx}\right)_{n+1} = \left(\frac{n!\,p^xq^{n-x}}{x!}\cdot\frac{d}{dx}\frac{1}{(n-x)!}\right)_{n+1} = -\frac{p^{n+1}q^{-1}}{n+1}$$

since $\frac{d}{dx}\frac{1}{x!}$, as is known, $= 1$ when $x = -1$. Hence the area of the curve $y$ becomes

$$\int_{-1}^{n+1} y\,dx = 1 + \frac{1}{12}\cdot\frac{1}{n+1}\cdot\left(\frac{p^{n+2} + q^{n+2}}{pq}\right)$$

Analogous expressions can be found for the approximate value of the moments

$$\frac{\int xy\,dx}{\int y\,dx}, \quad \frac{\int x^2y\,dx}{\int y\,dx}, \text{ &c.,}$$

but the relations which result do not lead to sufficiently convenient formulae for practical use.
NOTE E.


On the Relations between the Successive Moments and
the Successive Summations of a Series.

The relations given on p. 60 may be systematically demonstrated, and developed to any extent that may be required, by means of the ordinary interpolation formulae combined with a table of the power-differences usually known as the "Differences of Nothing"; see Text-Book, Part II, Ch. xxii, Art. 11; Sunderland's "Notes on Finite Differences", pp. 24-5.

We have, by the ordinary interpolation formula,

$$v_x = v_0 + x\Delta v_0 + \frac{x(x-1)}{2}\Delta^2 v_0 + \dots$$

and hence

$$u_x v_x = u_x v_0 + x u_x\,\Delta v_0 + \frac{x(x-1)}{2}u_x\,\Delta^2 v_0 + \dots$$

so that

$$\Sigma(u_x v_x) = (\Sigma u_x)v_0 + (\Sigma x u_x)\,\Delta v_0 + \left(\Sigma\frac{x(x-1)}{2}u_x\right)\Delta^2 v_0 + \dots$$

$$= (\Sigma u_0)v_0 + (\Sigma^2 u_1)\,\Delta v_0 + (\Sigma^3 u_2)\,\Delta^2 v_0 + \dots$$

using the notation of p. 60.

Put $v_x = x^m$ and we have

$$\Sigma x^m u_x = (\Sigma u_0)0^m + (\Sigma^2 u_1)\Delta 0^m + (\Sigma^3 u_2)\Delta^2 0^m + \dots$$

Putting $m$ equal successively to 1, 2, 3 ..., taking the differences from the table of the differences of nothing, and noting that the first term vanishes whatever the value of $m$, we can write down at once

$$\left.\begin{aligned}
\Sigma x u_x &= \Sigma^2 u_1 \\
\Sigma x^2 u_x &= \Sigma^2 u_1 + 2\Sigma^3 u_2 \\
\Sigma x^3 u_x &= \Sigma^2 u_1 + 6\Sigma^3 u_2 + 6\Sigma^4 u_3 \\
\Sigma x^4 u_x &= \Sigma^2 u_1 + 14\Sigma^3 u_2 + 36\Sigma^4 u_3 + 24\Sigma^5 u_4 \\
\Sigma x^5 u_x &= \Sigma^2 u_1 + 30\Sigma^3 u_2 + 150\Sigma^4 u_3 + 240\Sigma^5 u_4 + 120\Sigma^6 u_5
\end{aligned}\right\} \qquad \text{(A)}$$

These equations, divided by $\Sigma u_0$, give the expressions for the moments set out on page 60.
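Equations (A) lend themselves to a short numerical check: form the successive summations of a series from the far end, read off $\Sigma^2 u_1$, $\Sigma^3 u_2$, ..., and combine them with the differences of nothing. The sketch below is illustrative; the function names and the sample series are assumptions, and the series is reckoned from $x = 0$.

```python
def differences_of_nothing(m):
    """Leading differences [0^m, Δ0^m, Δ²0^m, ..., Δ^m 0^m] of the series x^m."""
    column = [x ** m for x in range(m + 1)]
    leading = []
    while column:
        leading.append(column[0])
        column = [b - a for a, b in zip(column, column[1:])]
    return leading

def tail_sums(values):
    """Summation from the far end: out[x] = values[x] + values[x+1] + ..."""
    out, running = [], 0
    for v in reversed(values):
        running += v
        out.append(running)
    return out[::-1]

def moments_by_summation(u, highest):
    """Return [Σx·u_x, Σx²·u_x, ...] up to the given power, using equations (A)."""
    # sums[k] is the (k+1)-fold summation; sums[k][k] is Σ^(k+1) u_k
    sums = [tail_sums(u)]
    for _ in range(highest):
        sums.append(tail_sums(sums[-1]))
    moments = []
    for m in range(1, highest + 1):
        coeffs = differences_of_nothing(m)
        moments.append(sum(coeffs[k] * sums[k][k]
                           for k in range(m + 1) if k < len(u)))
    return moments

u = [3, 1, 4, 1, 5, 9]                       # illustrative series, x = 0..5
by_summation = moments_by_summation(u, 5)
direct = [sum(x ** m * ux for x, ux in enumerate(u)) for m in range(1, 6)]
print(differences_of_nothing(4))             # coefficients 0, 1, 14, 36, 24
print(by_summation, direct)
```

Only additions are needed to build the summation columns, which is the source of the saving of labour over multiplying every term by a power of $x$.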
Taking next the usual central difference formula,

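$$v_x = v_0 + xa_0 + \frac{x^2}{2!}b_0 + \frac{x(x^2-1)}{3!}c_0 + \frac{x^2(x^2-1)}{4!}d_0 + \frac{x(x^2-1)(x^2-4)}{5!}e_0 + \dots$$

$$= v_0 + xa_0 + \frac{(x+1)x + x(x-1)}{2\cdot 2!}\,b_0 + \frac{(x+1)x(x-1)}{3!}\,c_0 + \frac{\{(x+2)(x+1)x(x-1)\} + \{(x+1)x(x-1)(x-2)\}}{2\cdot 4!}\,d_0 + \dots$$

Thus,

$$\Sigma(u_x v_x) = (\Sigma u_0)v_0 + (\Sigma^2 u_1)a_0 + \tfrac12(\Sigma^3 u_1 + \Sigma^3 u_2)\,b_0 + (\Sigma^4 u_2)\,c_0 + \tfrac12(\Sigma^5 u_2 + \Sigma^5 u_3)\,d_0 + \dots$$

the law of the terms being manifest; or, abbreviating the expression $\tfrac12(\Sigma^t u_x + \Sigma^t u_{x+1})$ by the single symbol $\Sigma^t u_{x+\frac12}$, the series may be written

$$\Sigma(u_x v_x) = (\Sigma u_0)v_0 + (\Sigma^2 u_1)a_0 + (\Sigma^3 u_{1\frac12})\,b_0 + (\Sigma^4 u_2)\,c_0 + \dots$$

Putting $v_x = x^m$, forming the central differences of $x^m$ as shown in the scheme below, we write down at once

$$\left.\begin{aligned}
\Sigma x u_x &= \Sigma^2 u_1 \\
\Sigma x^2 u_x &= 2\Sigma^3 u_{1\frac12} \\
\Sigma x^3 u_x &= 6\Sigma^4 u_2 + \Sigma^2 u_1 \\
\Sigma x^4 u_x &= 24\Sigma^5 u_{2\frac12} + 2\Sigma^3 u_{1\frac12}
\end{aligned}\right\} \qquad \text{(B)}$$

Scheme of central differences (values in brackets are the means of the two adjacent differences on the central line $x = 0$):

    v = x^2
      x        -2    -1     0     1     2
      v_x       4     1     0     1     4
      Δ           -3    -1  (0)   1     3
      Δ²                2     2     2

    v = x^3
      x        -3    -2    -1     0     1     2     3
      v_x     -27    -8    -1     0     1     8    27
      Δ           19     7     1  (1)   1     7    19
      Δ²            -12    -6     0     6    12
      Δ³                  6     6  (6)   6     6

    v = x^4
      x        -3    -2    -1     0     1     2     3
      v_x      81    16     1     0     1    16    81
      Δ          -65   -15    -1  (0)   1    15    65
      Δ²             50    14     2    14    50
      Δ³               -36   -12  (0)  12    36
      Δ⁴                   24    24    24

The simplification in the formulae is, of course, due to the fact that when $m$ is even the odd central differences vanish, and when $m$ is odd the even central differences vanish.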


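It is sometimes required to find moments of the form $\Sigma\left(\frac{2x-1}{2}\right)^m w_x$. For this purpose we may use the formula (see Sunderland's "Notes on Finite Differences," p. 32)

$$v_x = \tfrac12(v_0 + v_1) + (x - \tfrac12)\Delta v_0 + \frac{x(x-1)}{2!}\cdot\tfrac12(\Delta^2 v_0 + \Delta^2 v_{-1}) + \frac{x(x-1)(x-\tfrac12)}{3!}\Delta^3 v_{-1} + \frac{(x+1)x(x-1)(x-2)}{4!}\cdot\tfrac12(\Delta^4 v_{-1} + \Delta^4 v_{-2}) + \dots$$

$$= \tfrac12(v_0 + v_1) + \tfrac12\{x + (x-1)\}\Delta v_0 + \frac{x(x-1)}{2!}\cdot\tfrac12(\Delta^2 v_0 + \Delta^2 v_{-1}) + \tfrac12\cdot\frac{(x+1)x(x-1) + x(x-1)(x-2)}{3!}\Delta^3 v_{-1} + \frac{(x+1)x(x-1)(x-2)}{4!}\cdot\tfrac12(\Delta^4 v_{-1} + \Delta^4 v_{-2}) + \dots$$

whence we find, in the same manner as before, that commencing with $v_1w_1$, we shall have

$$\Sigma v_x w_x = \Sigma w_1\cdot\tfrac12(v_0 + v_1) + \Sigma^2 w_{1\frac12}\cdot\Delta v_0 + \Sigma^3 w_2\cdot\tfrac12(\Delta^2 v_0 + \Delta^2 v_{-1}) + \Sigma^4 w_{2\frac12}\cdot\Delta^3 v_{-1} + \Sigma^5 w_3\cdot\tfrac12(\Delta^4 v_{-1} + \Delta^4 v_{-2}) + \dots$$

Putting $v_x = \left(x - \tfrac12\right)^m = \left(\frac{2x-1}{2}\right)^m$, the following Table shows the values of $(2x-1)^m$ and its differences, whence $v_x$ and its differences will be found by dividing by $2^m$.

Values in brackets are means of the two adjacent entries, taken on the central line between $x = 0$ and $x = 1$:

      x           -2    -1     0     1     2     3
      2x-1        -5    -3    -1     1     3     5

    (2x-1)^1      -5    -3    -1 (0)  1     3     5
      Δ               2     2     2     2     2

    (2x-1)^2      25     9     1 (1)  1     9    25
      Δ             -16    -8     0     8    16
      Δ²                 8     8 (8)  8     8

    (2x-1)^3    -125   -27    -1 (0)  1    27   125
      Δ              98    26     2    26    98
      Δ²               -72   -24 (0) 24    72
      Δ³                   48    48    48

    (2x-1)^4     625    81     1 (1)  1    81   625
      Δ            -544   -80     0    80   544
      Δ²               464    80 (80) 80   464
      Δ³                 -384     0   384
      Δ⁴                    384 (384) 384

Dividing by the appropriate power of 2 and inserting the values of $\tfrac12(v_0 + v_1)$, $\Delta v_0$, $\tfrac12(\Delta^2 v_0 + \Delta^2 v_{-1})$, &c., the last formula becomes

$$\left.\begin{aligned}
\Sigma v_x w_x = \Sigma\left(\frac{2x-1}{2}\right)^m w_x, &\text{ commencing with } \left(\tfrac12\right)^m w_1, \\
= (\text{when } m = 1)\quad & \Sigma^2 w_{1\frac12} \\
= (\text{when } m = 2)\quad & 2\Sigma^3 w_2 + \tfrac14\Sigma w_1 \\
= (\text{when } m = 3)\quad & 6\Sigma^4 w_{2\frac12} + \tfrac14\Sigma^2 w_{1\frac12} \\
= (\text{when } m = 4)\quad & 24\Sigma^5 w_3 + 5\Sigma^3 w_2 + \tfrac{1}{16}\Sigma w_1
\end{aligned}\right\} \qquad \text{(C)}$$

Writing now $w_1 = u_0$, and so on, i.e., reckoning the ordinates from zero, so that the moments are of the form $u_0\left(\tfrac12\right)^m + u_1\left(\tfrac32\right)^m + \dots$, these become

$$\left.\begin{aligned}
Nm_1 &= \Sigma^2 u_{\frac12} \\
Nm_2 &= 2\Sigma^3 u_1 + \tfrac14\Sigma u_0 \\
Nm_3 &= 6\Sigma^4 u_{1\frac12} + \tfrac14\Sigma^2 u_{\frac12} \\
Nm_4 &= 24\Sigma^5 u_2 + 5\Sigma^3 u_1 + \tfrac{1}{16}\Sigma u_0
\end{aligned}\right\} \qquad \text{(D)}$$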
(T)) 
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This   will   be   made   clearer   by  a  numerical  example.     Take  the 


following  series. 


Distance 

from  origin 

X 

Ux 

multiplied 
by  2 
=  d 

Vx  X  d 

Ux  ^  d- 

Vx  X  d^ 

ttx  X  <£■* 

-5 

16-74 

1 

16-74 

16-74 

16-74 

16-74 

1-5 

15-69 

3 

47-07 

141-21 

423-63 

1270-89 

2-5 

14-70 

5 

73-50 

367-50 

1837-50 

9187-50 

3-5 

12-99 

7 

90-93 

636-51 

4455-57 

31188-99 

60-12 

228-24 

1161-96 

6733-44 

41664-12 

-^-2  = 

-f-4  = 

^8  = 

-^16  = 

114-12 

290-49 

841-68 

2604-01 

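The alternative method by summation will be as follows, the bracketed figures being the means of the terms immediately above and below them:

| x | u_x | Σu_x | Σ^2 u_x | Σ^3 u_x | Σ^4 u_x | Σ^5 u_x |
|---|---|---|---|---|---|---|
| .5 | 16.74 | 60.12 | 144.18 | | | |
| | | | (114.12) | | | |
| 1.5 | 15.69 | 43.38 | 84.06 | 137.73 | 204.39 | |
| | | | | | (135.525) | |
| 2.5 | 14.70 | 27.69 | 40.68 | 53.67 | 66.66 | 79.65 |
| 3.5 | 12.99 | 12.99 | 12.99 | 12.99 | 12.99 | 12.99 |

$$\Sigma^2 u_1 = 114.12$$

$$2\,\Sigma^3 u_{1\frac12} + \tfrac14\,\Sigma u_{\frac12} = 2 \times 137.73 + \tfrac14 \times 60.12 = 275.46 + 15.03 = 290.49$$

$$6\,\Sigma^4 u_2 + \tfrac14\,\Sigma^2 u_1 = 6 \times 135.525 + \tfrac14 \times 114.12 = 813.15 + 28.53 = 841.68$$

$$24\,\Sigma^5 u_{2\frac12} + 5\,\Sigma^3 u_{1\frac12} + \tfrac1{16}\,\Sigma u_{\frac12} = 24 \times 79.65 + 5 \times 137.73 + \tfrac1{16} \times 60.12 = 1911.60 + 688.65 + 3.76 = 2604.01$$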
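The two routes to the moments in the numerical example can be compared in a few lines. This is only a sketch under stated assumptions: the helper names `sums_from_end` and `mean` are illustrative, the successive summations are taken from the last term backwards as in the table, and half-indexed sums are read as the mean of the two adjacent column entries. Exact fractions are used so that the comparison is free of rounding.

```python
from fractions import Fraction

def sums_from_end(values):
    """Running totals accumulated from the last term back to the first."""
    out, running = [], Fraction(0)
    for v in reversed(values):
        running += v
        out.append(running)
    return out[::-1]

def mean(a, b):
    return (a + b) / 2

u = [Fraction(s) for s in ("16.74", "15.69", "14.70", "12.99")]
x = [Fraction(2 * i + 1, 2) for i in range(len(u))]      # .5, 1.5, 2.5, 3.5

# Direct method: sum of u_x * x**m for m = 1, 2, 3, 4.
direct = [sum(ui * xi ** m for ui, xi in zip(u, x)) for m in (1, 2, 3, 4)]

# Summation method: successive columns of the summation table.
col1 = sums_from_end(u)            # first sums, rows x = .5 to 3.5
col2 = sums_from_end(col1)         # second sums, same rows
col3 = sums_from_end(col2[1:])     # third sums, commencing at x = 1.5
col4 = sums_from_end(col3)         # fourth sums, commencing at x = 1.5
col5 = sums_from_end(col4[1:])     # fifth sums, commencing at x = 2.5

S1 = col1[0]                       # first sum at x = 1/2
S2 = mean(col2[0], col2[1])        # second sum at x = 1 (interpolated)
S3 = col3[0]                       # third sum at x = 1 1/2
S4 = mean(col4[0], col4[1])        # fourth sum at x = 2 (interpolated)
S5 = col5[0]                       # fifth sum at x = 2 1/2

by_summation = [S2, 2 * S3 + S1 / 4, 6 * S4 + S2 / 4, 24 * S5 + 5 * S3 + S1 / 16]

print([float(v) for v in direct])
print([float(v) for v in by_summation])
```

The summation route needs only additions until the final step, which is the saving of labour the text refers to for a long series.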

With  a  heavy  series  of  terms,  the  saving  of  labour  by  the 
summation  method  will,  as  may  easily  be  seen,  be  very  considerable. 
A  further  saving  of  labour  may  be  obtained  by  calculating  the 
moments  round  some  convenient  central  point,  and  thus  breaking 
up  the  series  into  two  parts  in  the  manner  indicated  in 
Mr. Elderton's treatise, pp. 22-33; and any of the formulae described
in these notes may be applied in this manner.

NOTE F.


On the Identity of the Method of Moments and Method
of Least Squares in the Case of an Exponential
Function.

Suppose y an exponential function of x so that

$$y = e^{a+bx+cx^2+\dots} = e^{z}, \text{ say.}$$

Then if y be taken to represent any group in a frequency distribution where the number of groups is large the probable error in y
will be approximately $\sqrt{y}$. Assume the true values of y, i.e., the
true values of a, b, c ..., to be approximately known, and let the
observed values of y be denoted by $y'$. If, then, we weight each
equation $y' - e^{z} = 0$ by the factor $\dfrac{1}{\sqrt{y}}$, writing

$$\frac{1}{\sqrt{y}}\,(y' - e^{z}) = 0 \qquad (1)$$

we shall have a series of equations of condition in which the
probable error is in each case identical; that is to say, they will be
suitably weighted for the application of the method of least
squares (see Note C, pp. 117-8).

Writing

$$e^{z} = y + \delta a\,\frac{dy}{da} + \delta b\,\frac{dy}{db} + \text{&c.} = y\,(1 + \delta a + x\,\delta b + x^2\,\delta c, \text{ &c.})$$

where y is now computed from the approximate values of the constants, equation (1) becomes

$$\frac{1}{\sqrt{y}}\,\big[\,y\,(1 + \delta a + x\,\delta b + \text{&c.}) - y'\,\big] = 0 \qquad (2)$$

and multiplying each equation successively by the coefficients of
$\delta a$, $\delta b$, &c., i.e., by $\dfrac{y}{\sqrt{y}}$, $\dfrac{y}{\sqrt{y}}\,x$, $\dfrac{y}{\sqrt{y}}\,x^2$, &c., and taking the sum of each
set of products, according to the rules of the method of least
squares, we get

$$\Sigma\,[\,y\,(1 + \delta a + x\,\delta b + x^2\,\delta c, \text{ &c.}) - y'\,] = 0$$

$$\Sigma\,x\,[\,y\,(1 + \delta a + x\,\delta b + x^2\,\delta c, \text{ &c.}) - y'\,] = 0$$

&c., &c.

as the system of equations for determining, according to the
method of least squares, the small corrections to be applied to the
approximate values a, b, c, ... used in obtaining the approximate
values of y.

Now, obviously, if y is so taken that

$$\Sigma\,(y - y') = 0, \qquad \Sigma\,x\,(y - y') = 0, \qquad \Sigma\,x^2\,(y - y') = 0, \text{ &c.}$$

i.e., if the values of the constants a, b, c, &c., are found by the method
of moments, the above equations are satisfied by $\delta a = \delta b = \delta c = 0$;
that is to say, the corrections are zero, or the values found for
a, b, c ... by the method of moments are in conformity with the
method of least squares on the assumption that the observations are
properly weighted by multiplying by the factors $\dfrac{1}{\sqrt{y}}$, the weights
being assumed invariable. It may, however, be supposed that small
variations in the constants a, b, c, ... would produce slight
variations in the weights, in which case other solutions may exist
which would also lead, by the method of least squares, to equations
satisfied by $\delta a = \delta b = \delta c = 0$; but as it is well known that small
differences in weights have practically no effect on the results, it is
evident that any such alternative solution must be very close to
that already found.
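The conclusion of this Note can be illustrated numerically. The sketch below is not from the text: the six observed groups are hypothetical figures, the curve is restricted to the two-constant form $e^{a+bx}$, and the function names are illustrative. It fits the curve by equating the zeroth and first moments, then holds the weights invariable at the fitted values and checks that the weighted sum of squares is stationary at the fitted constants.

```python
import math

xs = [0, 1, 2, 3, 4, 5]
observed = [98.0, 163.0, 271.0, 452.0, 733.0, 1219.0]   # hypothetical observed groups

def curve(a, b):
    return [math.exp(a + b * x) for x in xs]

def fit_by_moments(a, b, steps=50):
    """Newton's method on sum(y - y') = 0 and sum(x * (y - y')) = 0."""
    for _ in range(steps):
        y = curve(a, b)
        f0 = sum(yi - oi for yi, oi in zip(y, observed))
        f1 = sum(x * (yi - oi) for x, yi, oi in zip(xs, y, observed))
        # Jacobian of (f0, f1) with respect to (a, b).
        j00 = sum(y)
        j01 = sum(x * yi for x, yi in zip(xs, y))
        j11 = sum(x * x * yi for x, yi in zip(xs, y))
        det = j00 * j11 - j01 * j01
        a -= (j11 * f0 - j01 * f1) / det
        b -= (j00 * f1 - j01 * f0) / det
    return a, b

a_hat, b_hat = fit_by_moments(math.log(100.0), 0.5)
fitted = curve(a_hat, b_hat)     # also used as the invariable weights below

def weighted_sum_of_squares(a, b):
    # Each equation carries the factor 1/sqrt(y), so each square carries 1/y.
    return sum((oi - yi) ** 2 / w
               for oi, yi, w in zip(observed, curve(a, b), fitted))

h = 1e-6
slope_a = (weighted_sum_of_squares(a_hat + h, b_hat)
           - weighted_sum_of_squares(a_hat - h, b_hat)) / (2 * h)
slope_b = (weighted_sum_of_squares(a_hat, b_hat + h)
           - weighted_sum_of_squares(a_hat, b_hat - h)) / (2 * h)
print(a_hat, b_hat, slope_a, slope_b)
```

Both slopes come out at the level of the finite-difference error, which is what the argument above predicts when the weights are held fixed.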
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XOTE    G. 


On   obtaining  the  value  of  Makeham's  constant  c 
direct  from  the  exposures  and  deaths. 


As stated in the text an exact value of this constant is not very
important, and this may be illustrated by reference to the data for
ascending premium assurances given in Table X. An approximate
value for c may readily be found by a process such as the following,
which is in principle analogous to the aggregate method employed
by Mr. King in the Text-Book, Part II. Take the values of μ for
the central age of each group in Table X. Reject the initial and
final values, as depending upon only two and three deaths respectively.
Take the six values for central ages 32½ to 57½, weighted respectively
by the factors 1, 3, 5, 5, 3, 1; weight the six values for central
ages 47½ to 72½, and also for 62½ to 87½, in the same manner. We
shall then have the following totals:

| Ages 32½ to 57½ | Ages 47½ to 72½ | Ages 62½ to 87½ |
|---|---|---|
| μ(32½) × 1 = .0119 | μ(47½) × 1 = .0137 | μ(62½) × 1 = .0340 |
| μ(37½) × 3 = .0345 | μ(52½) × 3 = .0534 | μ(67½) × 3 = .1647 |
| μ(42½) × 5 = .0655 | μ(57½) × 5 = .1160 | μ(72½) × 5 = .3660 |
| μ(47½) × 5 = .0685 | μ(62½) × 5 = .1700 | μ(77½) × 5 = .5720 |
| μ(52½) × 3 = .0534 | μ(67½) × 3 = .1647 | μ(82½) × 3 = .7540 |
| μ(57½) × 1 = .0232 | μ(72½) × 1 = .0732 | μ(87½) × 1 = .3379 |
| S_1 = .2570 | S_2 = .5910 | S_3 = 2.1286 |

If the mortality follows Makeham's law, we shall have

$$\frac{S_3 - S_2}{S_2 - S_1} = \frac{1.5376}{.3340} = c^{15}$$

since 15 years is the interval between the centres of our empirical
groups. This gives log c = .0442 nearly. If we take the sum of
the unweighted values of μ in three groups for ages 32½ to 47½, 52½
to 67½, and 72½ to 87½, we should obtain in similar manner

$$c^{20} = \frac{S_3 - S_2}{S_2 - S_1} = \frac{.6136}{.0797}$$

giving log c = .0443.

We may conclude, therefore, that log c probably lies between
.044 and .045. The values of μ for ages 27½ and 92½, which we
have omitted in the foregoing, are respectively much below and
much above the general curve. If these values had been
included duly weighted, we should have obtained a slightly larger
value of log c, nearer to .045.
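The arithmetic of the two estimates can be reproduced as below. This is a sketch only: the function names are illustrative, the totals are those quoted in the text, and the second half of the block uses an exact Makeham curve with assumed constants merely to show that the ratio of differences of equally weighted group totals returns $c^{15}$ whatever the value of A.

```python
import math

def log10_c(upper_difference, lower_difference, interval_years):
    """log10 of c from (S3 - S2)/(S2 - S1) = c ** interval_years."""
    return math.log10(upper_difference / lower_difference) / interval_years

# Totals quoted in the text.
S1, S2, S3 = 0.2570, 0.5910, 2.1286
weighted = log10_c(S3 - S2, S2 - S1, 15)
unweighted = log10_c(0.6136, 0.0797, 20)
print(round(weighted, 4), round(unweighted, 4))

# Check on an exact Makeham curve (constants assumed for illustration only).
def makeham(age, A=0.0095, B=0.00003712, log10_of_c=0.045):
    return A + B * 10 ** (log10_of_c * age)

def weighted_total(first_age):
    weights = (1, 3, 5, 5, 3, 1)          # the weights used in the text
    return sum(w * makeham(first_age + 5 * i) for i, w in enumerate(weights))

T1, T2, T3 = (weighted_total(age) for age in (32.5, 47.5, 62.5))
recovered = log10_c(T3 - T2, T2 - T1, 15)
print(recovered)
```

The constant A contributes the same amount to each weighted total and so disappears from the differences, leaving a pure ratio of powers of c.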

If we adopt .045 as an approximate value, we obtain for the
values of the constants A and B, by the process described on p. 65,

A = .00950,  B = .00003712.

We will call this curve (α), the deviations from the adjusted values
of θ in Table X being shown in the Table below.


Ascending Premium Assurance Experience.
Deviations in Computed Deaths for Curves (α) and (β).

Deviations are Computed Deaths less Observed Deaths. The individual deviations are shown by magnitude only, the + and − columns of the printed table not being legible in this copy; the signed column totals are given in the last three rows.

| Middle Age of Group | Observed Deaths corrected as per Table (X) | Curve (α), log c = .045 | Curve (β), log c = .046 |
|---|---|---|---|
| 27½ | .8 | .9 | 1.0 |
| 32½ | 29.2 | 3.5 | 2.8 |
| 37½ | 102.0 | 1.8 | .2 |
| 42½ | 175.2 | 7.3 | 5.9 |
| 47½ | 191.7 | 12.8 | 13.0 |
| 52½ | 218.6 | 3.3 | 2.0 |
| 57½ | 228.4 | 7.3 | 5.1 |
| 62½ | 255.4 | 2.7 | 5.2 |
| 67½ | 274.4 | 24.8 | 26.5 |
| 72½ | 205.6 | 12.0 | 12.0 |
| 77½ | 151.5 | 12.1 | 13.6 |
| 82½ | 84.8 | 6.6 | 5.1 |
| 87½ | 22.3 | .5 | .2 |
| 92½ | 2.1 | 1.1 | 1.1 |
| Sum of deviations | | + 48.4, − 48.3 | + 46.9, − 46.8 |
| Second sum | | + 38.9, − 37.5 | + 39.3, − 39.9 |
| Third sum | | + 13.7, − 71.8 | + 31.9, − 36.9 |

We might expect from our first rough approximation to log c that a smaller
value than .045, say .044, would give better results. We find,
however, that the third sum of the errors of the (α) curve is
negative, and this indicates an increase in the value of log c.
Since a higher value of c hollows out the curve at the middle
ages, increasing the computed deaths at the extremes of the table,
it is clear that the effect must be to increase the third sum of the
graduated deaths.

The probability is, therefore, that curve (α) will not be much
improved by changing the value of c.

If we take the alternative value log c = .046 we find the
deviations from the adjusted values of θ in Table X are as given for
curve (β) in the Table above.

There is little to choose between the two graduations, notwithstanding
the smallness of the third sum of the deviations in curve
(β), for against this may be put the fact that the three largest
errors in (α) are all increased in (β). On the whole the curves may
be taken as showing that an approximate value of log c is generally
sufficient, and that nothing is gained by computing this constant to
several places of decimals.

It may at first sight appear inconsistent with the general theory
to adopt values of the three constants which do not make the third
sum vanish; i.e., the third moment of the graduated and ungraduated
figures identical. It must, however, be remembered that the method
of least squares (and with it the method of moments) assumes that
the form of the curve is known a priori, in which case the method
gives the means of determining the most probable values of the
constants involved. When, however, we are dealing with a
mortality experience, we have no a priori right to assume that
Makeham's law is strictly applicable; and, if it is not, the
deviations, instead of following the normal law as assumed in the
theory of least squares, will include systematic deviations due to
departure from the Makeham law. In these circumstances the
method of least squares is not strictly applicable, and we are
therefore justified in allowing other considerations to guide us in
the selection of the constants.


We may here note that if the exposures are represented by a
frequency curve, the deaths being recomputed to correspond to the
graduated exposures, then the value of log c may, in general, be
calculated from the moments of the exposures and of the recomputed
deaths. This can readily be done if the exposures are represented
by a binomial curve (see Calderon, J.I.A., xxxv, 157), although
precautions must be taken so to group the data that the number of
terms in the binomial is not great, say not more than five or six;
or by the normal frequency curve (see Elderton's "Frequency
Curves", pp. 98-100); or by the curve $y = kx^{m}e^{-\gamma x}$, where, if $E_0$, $E_1$,
&c., represent the successive moments for the exposures round the
origin, and $\theta_0$, $\theta_1$, &c., the similar moments of the recomputed
deaths, we shall have

$$\frac{\gamma - \log_e c}{\gamma} = \frac{\dfrac{\theta_1}{E_1} - \dfrac{\theta_0}{E_0}}{\dfrac{\theta_2}{E_2} - \dfrac{\theta_1}{E_1}}$$

whence, $\gamma$ being known, $\log_e c$ is easily found.
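The relation just stated can be checked numerically before following the demonstration. The sketch below makes several assumptions that are not in the text: the constants k, m, γ, A and B are chosen for illustration, the deaths are taken as $(A + Be^{\lambda x})$ times the exposure curve with $\lambda = \log_e c$, and the moments are evaluated in closed form by the Gamma function, which requires $\gamma > \lambda$.

```python
import math

def gamma_curve_moment(t, k, m, rate):
    """t-th moment about the origin of k * x**m * exp(-rate * x) on (0, infinity)."""
    return k * math.gamma(m + t + 1) / rate ** (m + t + 1)

# Illustrative constants (assumptions, not values from the text).
k, m, gam = 1.0, 3.0, 0.20
A, B = 0.0095, 0.00003712
lam = 0.045 * math.log(10)          # log_e c when log10 c = .045

# Moments of the exposures, and of the exposures multiplied by exp(lam * x).
E = [gamma_curve_moment(t, k, m, gam) for t in range(3)]
E_dash = [gamma_curve_moment(t, k, m, gam - lam) for t in range(3)]

# Moments of the recomputed deaths.
theta = [A * E[t] + B * E_dash[t] for t in range(3)]

ratio = ((theta[1] / E[1] - theta[0] / E[0])
         / (theta[2] / E[2] - theta[1] / E[1]))
lam_recovered = gam * (1 - ratio)   # from (gam - lam)/gam = ratio
print(lam, lam_recovered, lam_recovered / math.log(10))
```

The recovered value agrees with the assumed $\log_e c$ to rounding error, and dividing by $\log_e 10$ returns the common logarithm.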

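The above relation may be thus demonstrated. The force of
mortality at age x is assumed to be of the form $A + Bc^{x} = A + Be^{x\log_e c} = A + Be^{\lambda x}$,
putting $\lambda = \log_e c$. Thus the death curve will be of the
form $A\,kx^{m}e^{-\gamma x} + B\,kx^{m}e^{-(\gamma-\lambda)x}$, where the second term is of the same
form as the first with $\gamma - \lambda$ substituted for $\gamma$. But by the well-known
properties of the Gamma integral (see Williamson's
"Integral Calculus", Art. 120) we have

$$\int_0^{\infty} x^{m+1}e^{-\gamma x}\,dx = \frac{m+1}{\gamma}\int_0^{\infty} x^{m}e^{-\gamma x}\,dx$$

whence it is easily seen that, writing $E'_0$, $E'_1$ ... for the moments
of $kx^{m}e^{-(\gamma-\lambda)x}$,

$$
\begin{aligned}
E_0 &= E_0 & \theta_0 &= AE_0 + BE'_0 \\
E_1 &= E_0 \times \frac{m+1}{\gamma} & \theta_1 &= AE_1 + BE'_0\,\frac{m+1}{\gamma-\lambda} \\
E_2 &= E_0 \times \frac{(m+1)(m+2)}{\gamma^2} & \theta_2 &= AE_2 + BE'_0\,\frac{(m+1)(m+2)}{(\gamma-\lambda)^2}
\end{aligned}
$$

whence

$$
\begin{aligned}
\theta_0 \div E_0 &= A + B\,\frac{E'_0}{E_0} = A + B' \\
\theta_1 \div E_1 &= A + B\,\frac{E'_0}{E_0}\cdot\frac{\gamma}{\gamma-\lambda} = A + B'\,\frac{\gamma}{\gamma-\lambda} & \Delta &= B'\,\frac{\lambda}{\gamma-\lambda} \\
\theta_2 \div E_2 &= A + B\,\frac{E'_0}{E_0}\cdot\frac{\gamma^2}{(\gamma-\lambda)^2} = A + B'\left(\frac{\gamma}{\gamma-\lambda}\right)^{2} & \Delta_1 &= B'\,\frac{\gamma\lambda}{(\gamma-\lambda)^2}
\end{aligned}
$$

so that

$$\Delta \div \Delta_1 = \frac{\gamma-\lambda}{\gamma} = \frac{\gamma - \log_e c}{\gamma}.$$

If the exposures, as often happens, can only be represented by
a curve of the form $y = kx^{\alpha}(1-x)^{\beta}$ (where x represents a proportionate
part of the range of the curve so that x ranges between 0
and 1) and if, as before, we represent the successive moments for
exposures and deaths by $m_0$, $m_1$, &c., $M_0$, $M_1$, &c., where $m_0$ and $M_0$ are
made = 1, then writing

$$(\alpha+1) - (\alpha+\beta+2)M_1 = R_0$$

$$(\alpha+2)M_1 - (\alpha+\beta+3)M_2 = R_1$$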

it will be found that, putting r for the range of the curve in years
of age,

$$r\log_e c = \frac{R_0}{(M_2 - M_1) - h(m_2 - m_1)} = \frac{R_1}{(M_3 - M_2) - h(m_3 - m_2)}$$

from which, as the numerical values of all the quantities except
$\log_e c$ and $h$ are known, these two may be easily found.

This may be shown as follows:

Let the curve of exposed to risk be represented by the type

$$y = kx^{\alpha}(1-x)^{\beta}$$

where the entire range of the curve is taken as unity, and assume
k, α, and β to be determined in the usual manner.

Let the curve of the recomputed deaths be of the form

$$Akx^{\alpha}(1-x)^{\beta} + Bkx^{\alpha}(1-x)^{\beta}e^{\gamma x} = Ay + Bz \qquad (1)$$

i.e., we assume that $-\dfrac{d}{dx}(\log l_x) = A + Be^{\gamma x}$.

As regards the curve z, we shall have

$$\log z = \log k + \alpha\log x + \beta\log(1-x) + \gamma x$$

$$\frac{dz}{dx} = \left(\frac{\alpha}{x} - \frac{\beta}{1-x} + \gamma\right)z$$

or, multiplying both sides by $x^{t+1}(1-x)$,

$$(x^{t+1} - x^{t+2})\,\frac{dz}{dx} = \left[\alpha x^{t} - (\alpha+\beta)x^{t+1} + \gamma(x^{t+1} - x^{t+2})\right]z \qquad (2)$$

Integrating the left-hand side of this equation by parts, and
noting that the factor $(x^{t+1} - x^{t+2})$ is zero for the limits 1 and 0,

$$\int_0^1 (x^{t+1} - x^{t+2})\,\frac{dz}{dx}\,dx = -\int_0^1 \left[(t+1)x^{t} - (t+2)x^{t+1}\right]z\,dx$$

that is

$$(t+2)m'_{t+1} - (t+1)m'_t = \alpha m'_t - (\alpha+\beta)m'_{t+1} + \gamma(m'_{t+1} - m'_{t+2})$$

and

$$(\alpha+t+1)m'_t - (\alpha+\beta+t+2)m'_{t+1} + \gamma(m'_{t+1} - m'_{t+2}) = 0 \qquad (3)$$

where $m'_t$ represents the tth moment of the curve z round the
ordinate x = 0.

If γ = 0, the curve z becomes identical with y, and writing $m_t$
for the tth moment of y round the ordinate x = 0, we have

$$(\alpha+t+1)m_t - (\alpha+\beta+t+2)m_{t+1} = 0 \qquad (4)$$

Write, as before, the total of the exposed $= E_0$, and of the
deaths $= \theta_0$, respectively, and represent the total of the exposed
multiplied at each age by the factor $e^{\gamma x}$ by $E'_0$.

Let $E_t$ and $\theta_t$ be the tth moments of the curve of exposed to
risk and of the recomputed deaths, the areas of the curves not
being taken as = 1, but having the values $E_0$ and $\theta_0$ above defined,
that is to say, representing the total exposures and the total deaths.
And let $E'_t$ be the tth moment of the curve of exposures multiplied
at each age by $e^{\gamma x}$.

Then we have

$$\theta_t = AE_t + BE'_t \qquad (5)$$

where $\theta_t$ and $E_t$ are known, but the remaining quantities unknown.

From (3) and (4)

$$(\alpha+t+1)E_t - (\alpha+\beta+t+2)E_{t+1} = 0 \qquad (6)$$

and

$$(\alpha+t+1)E'_t - (\alpha+\beta+t+2)E'_{t+1} + \gamma(E'_{t+1} - E'_{t+2}) = 0 \qquad (7)$$

Write $(\alpha+t+1)\theta_t - (\alpha+\beta+t+2)\theta_{t+1} = R_t$,

from (5)

$$(\alpha+t+1)(AE_t + BE'_t) - (\alpha+\beta+t+2)(AE_{t+1} + BE'_{t+1}) = R_t$$

and from (6)

$$(\alpha+t+1)BE'_t - (\alpha+\beta+t+2)BE'_{t+1} = R_t$$

and from (7)

$$B\gamma(E'_{t+2} - E'_{t+1}) = R_t \qquad (8)$$

Since from (5) $BE'_t = \theta_t - AE_t$,

we have

$$\gamma\left[(\theta_{t+2} - \theta_{t+1}) - A(E_{t+2} - E_{t+1})\right] = R_t \qquad (9)$$

writing t = 0 and t = 1 respectively, we get

$$\gamma\left[(\theta_2 - \theta_1) - A(E_2 - E_1)\right] = R_0$$

$$\gamma\left[(\theta_3 - \theta_2) - A(E_3 - E_2)\right] = R_1$$

whence

$$R_1\left[(\theta_2 - \theta_1) - A(E_2 - E_1)\right] = R_0\left[(\theta_3 - \theta_2) - A(E_3 - E_2)\right]$$

and

$$A = \frac{R_1(\theta_2 - \theta_1) - R_0(\theta_3 - \theta_2)}{R_1(E_2 - E_1) - R_0(E_3 - E_2)} \qquad (10)$$

also, from (9)

$$\gamma = \frac{R_0}{(\theta_2 - \theta_1) - A(E_2 - E_1)}$$

Dividing (9) throughout by $\theta_0$, and writing $h$ for $AE_0 \div \theta_0$ and $r\log_e c$ for $\gamma$, we have the form first stated.

The value of B cannot be determined directly from these equations
as it enters symmetrically with the values of $E'_t$. It is therefore
necessary, having found the value of γ, to compute the value of $E'_0$
and thence deduce B from equation (5).

Unless the mortality follows Makeham's law very closely better
results will be obtained by calculating both $E'_0$ and $E'_1$ and obtaining
values of A and B satisfying the equations

$$AE_0 + BE'_0 = \theta_0$$

$$AE_1 + BE'_1 = \theta_1$$
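Equations (9) and (10) can be exercised on artificial data. The sketch below rests on assumptions not found in the text: the values of α, β, A, B and γ are hypothetical, k is taken as 1, γ is taken positive, and the moments of $x^{\alpha}(1-x)^{\beta}e^{\gamma x}$ are obtained by expanding the exponential and integrating term by term with the Beta function. It builds the death moments from known constants and then recovers A, γ and B from the moments alone.

```python
import math

def log_beta(p, q):
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)

def moment(t, alpha, beta, gam=0.0, terms=150):
    """t-th moment about x = 0 of x**alpha * (1-x)**beta * exp(gam*x) on [0, 1].

    For gam > 0 the exponential is expanded in powers of gam*x and each
    term is integrated by the Beta function; gam = 0 needs one term only.
    """
    if gam == 0.0:
        return math.exp(log_beta(alpha + t + 1, beta + 1))
    return sum(math.exp(j * math.log(gam) - math.lgamma(j + 1)
                        + log_beta(alpha + t + j + 1, beta + 1))
               for j in range(terms))

# Hypothetical constants used to manufacture the data.
alpha, beta = 2.0, 3.0
A, B, gam = 0.0095, 0.0005, 7.0

E = [moment(t, alpha, beta) for t in range(4)]
E_dash = [moment(t, alpha, beta, gam) for t in range(4)]
theta = [A * E[t] + B * E_dash[t] for t in range(4)]        # equation (5)

# R_t as defined in the text, for t = 0 and t = 1.
R = [(alpha + t + 1) * theta[t] - (alpha + beta + t + 2) * theta[t + 1]
     for t in (0, 1)]

# Equation (10), then gamma from equation (9) with t = 0.
A_hat = ((R[1] * (theta[2] - theta[1]) - R[0] * (theta[3] - theta[2]))
         / (R[1] * (E[2] - E[1]) - R[0] * (E[3] - E[2])))
gam_hat = R[0] / ((theta[2] - theta[1]) - A_hat * (E[2] - E[1]))

# B from equation (5) with t = 0, after computing E'_0 with the value found.
B_hat = (theta[0] - A_hat * E[0]) / moment(0, alpha, beta, gam_hat)
print(A_hat, gam_hat, B_hat)
```

With real data the moments would come from the graduated exposures and recomputed deaths, and the recovered constants would reflect how closely Makeham's law is followed.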
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Tables  of  Values  of  y- 


$$y = \frac{2}{\sqrt{\pi}}\int_0^{z} e^{-x^2}\,dx$$

The column Δ gives the difference between each value of y and the next, in units of the last decimal place.

| z | y | Δ | z | y | Δ | z | y | Δ |
|---|---|---|---|---|---|---|---|---|
| .01 | .01128 | 1128 | .51 | .52924 | 866 | 1.01 | .84681 | 403 |
| .02 | .02256 | 1128 | .52 | .53790 | 856 | 1.02 | .85084 | 394 |
| .03 | .03384 | 1127 | .53 | .54646 | 848 | 1.03 | .85478 | 387 |
| .04 | .04511 | 1126 | .54 | .55494 | 838 | 1.04 | .85865 | 379 |
| .05 | .05637 | 1125 | .55 | .56332 | 830 | 1.05 | .86244 | 370 |
| .06 | .06762 | 1124 | .56 | .57162 | 820 | 1.06 | .86614 | 363 |
| .07 | .07886 | 1122 | .57 | .57982 | 810 | 1.07 | .86977 | 356 |
| .08 | .09008 | 1120 | .58 | .58792 | 802 | 1.08 | .87333 | 347 |
| .09 | .10128 | 1118 | .59 | .59594 | 792 | 1.09 | .87680 | 341 |
| .10 | .11246 | 1116 | .60 | .60386 | 782 | 1.10 | .88021 | 332 |
| .11 | .12362 | 1114 | .61 | .61168 | 773 | 1.11 | .88353 | 326 |
| .12 | .13476 | 1111 | .62 | .61941 | 764 | 1.12 | .88679 | 318 |
| .13 | .14587 | 1108 | .63 | .62705 | 754 | 1.13 | .88997 | 311 |
| .14 | .15695 | 1105 | .64 | .63459 | 744 | 1.14 | .89308 | 304 |
| .15 | .16800 | 1101 | .65 | .64203 | 735 | 1.15 | .89612 | 298 |
| .16 | .17901 | 1098 | .66 | .64938 | 725 | 1.16 | .89910 | 290 |
| .17 | .18999 | 1094 | .67 | .65663 | 715 | 1.17 | .90200 | 284 |
| .18 | .20093 | 1091 | .68 | .66378 | 706 | 1.18 | .90484 | 278 |
| .19 | .21184 | 1086 | .69 | .67084 | 696 | 1.19 | .90761 | 270 |
| .20 | .22270 | 1082 | .70 | .67780 | 687 | 1.20 | .91031 | 265 |
| .21 | .23352 | 1078 | .71 | .68467 | 676 | 1.21 | .91296 | 257 |
| .22 | .24430 | 1072 | .72 | .69143 | 667 | 1.22 | .91553 | 252 |
| .23 | .25502 | 1068 | .73 | .69810 | 658 | 1.23 | .91805 | 246 |
| .24 | .26570 | 1063 | .74 | .70468 | 648 | 1.24 | .92051 | 239 |
| .25 | .27633 | 1057 | .75 | .71116 | 638 | 1.25 | .92290 | 234 |
| .26 | .28690 | 1052 | .76 | .71754 | 629 | 1.26 | .92524 | 227 |
| .27 | .29742 | 1046 | .77 | .72382 | 619 | 1.27 | .92751 | 222 |
| .28 | .30788 | 1040 | .78 | .73001 | 609 | 1.28 | .92973 | 217 |
| .29 | .31828 | 1035 | .79 | .73610 | 600 | 1.29 | .93190 | 211 |
| .30 | .32863 | 1028 | .80 | .74210 | 590 | 1.30 | .93401 | 205 |
| .31 | .33891 | 1022 | .81 | .74800 | 581 | 1.31 | .93606 | 201 |
| .32 | .34913 | 1015 | .82 | .75381 | 571 | 1.32 | .93807 | 195 |
| .33 | .35928 | 1008 | .83 | .75952 | 562 | 1.33 | .94002 | 189 |
| .34 | .36936 | 1002 | .84 | .76514 | 553 | 1.34 | .94191 | 185 |
| .35 | .37938 | 995 | .85 | .77067 | 543 | 1.35 | .94376 | 180 |
| .36 | .38933 | 988 | .86 | .77610 | 534 | 1.36 | .94556 | 175 |
| .37 | .39921 | 980 | .87 | .78144 | 525 | 1.37 | .94731 | 171 |
| .38 | .40901 | 973 | .88 | .78669 | 515 | 1.38 | .94902 | 165 |
| .39 | .41874 | 965 | .89 | .79184 | 507 | 1.39 | .95067 | 162 |
| .40 | .42839 | 958 | .90 | .79691 | 497 | 1.40 | .95229 | 156 |
| .41 | .43797 | 950 | .91 | .80188 | 489 | 1.41 | .95385 | 153 |
| .42 | .44747 | 942 | .92 | .80677 | 479 | 1.42 | .95538 | 148 |
| .43 | .45689 | 934 | .93 | .81156 | 471 | 1.43 | .95686 | 144 |
| .44 | .46623 | 925 | .94 | .81627 | 462 | 1.44 | .95830 | 140 |
| .45 | .47548 | 918 | .95 | .82089 | 453 | 1.45 | .95970 | 135 |
| .46 | .48466 | 909 | .96 | .82542 | 445 | 1.46 | .96105 | 132 |
| .47 | .49375 | 900 | .97 | .82987 | 436 | 1.47 | .96237 | 128 |
| .48 | .50275 | 892 | .98 | .83423 | 428 | 1.48 | .96365 | 125 |
| .49 | .51167 | 883 | .99 | .83851 | 419 | 1.49 | .96490 | 121 |
| .50 | .52050 | 874 | 1.00 | .84270 | 411 | 1.50 | .96611 | 117 |


Table of Values of $y = \dfrac{2}{\sqrt{\pi}}\displaystyle\int_0^{z} e^{-x^2}\,dx$, continued.

| z | y | Δ | z | y | Δ | z | y | Δ |
|---|---|---|---|---|---|---|---|---|
| 1.51 | .96728 | 117 | 2.01 | .995525 | 195 | 2.51 | .9996143 | 202 |
| 1.52 | .96841 | 111 | 2.02 | .995720 | 186 | 2.52 | .9996345 | 192 |
| 1.53 | .96952 | 107 | 2.03 | .995906 | 180 | 2.53 | .9996537 | 183 |
| 1.54 | .97059 | 103 | 2.04 | .996086 | 172 | 2.54 | .9996720 | 173 |
| 1.55 | .97162 | 101 | 2.05 | .996258 | 165 | 2.55 | .9996893 | 165 |
| 1.56 | .97263 | 97 | 2.06 | .996423 | 159 | 2.56 | .9997058 | 157 |
| 1.57 | .97360 | 95 | 2.07 | .996582 | 152 | 2.57 | .9997215 | 149 |
| 1.58 | .97455 | 91 | 2.08 | .996734 | 146 | 2.58 | .9997364 | 141 |
| 1.59 | .97546 | 89 | 2.09 | .996880 | 141 | 2.59 | .9997505 | 135 |
| 1.60 | .97635 | 86 | 2.10 | .997021 | 134 | 2.60 | .9997640 | 127 |
| 1.61 | .97721 | 83 | 2.11 | .997155 | 129 | 2.61 | .9997767 | 121 |
| 1.62 | .97804 | 80 | 2.12 | .997284 | 123 | 2.62 | .9997888 | 115 |
| 1.63 | .97884 | 78 | 2.13 | .997407 | 118 | 2.63 | .9998003 | 109 |
| 1.64 | .97962 | 76 | 2.14 | .997525 | 114 | 2.64 | .9998112 | 103 |
| 1.65 | .98038 | 72 | 2.15 | .997639 | 108 | 2.65 | .9998215 | 98 |
| 1.66 | .98110 | 71 | 2.16 | .997747 | 104 | 2.66 | .9998313 | 93 |
| 1.67 | .98181 | 68 | 2.17 | .997851 | 100 | 2.67 | .9998406 | 88 |
| 1.68 | .98249 | 66 | 2.18 | .997951 | 95 | 2.68 | .9998494 | 84 |
| 1.69 | .98315 | 64 | 2.19 | .998046 | 91 | 2.69 | .9998578 | 79 |
| 1.70 | .98379 | 62 | 2.20 | .998137 | 87 | 2.70 | .9998657 | 75 |
| 1.71 | .98441 | 59 | 2.21 | .998224 | 84 | 2.71 | .9998732 | 71 |
| 1.72 | .98500 | 58 | 2.22 | .998308 | 80 | 2.72 | .9998803 | 67 |
| 1.73 | .98558 | 55 | 2.23 | .998388 | 76 | 2.73 | .9998870 | 63 |
| 1.74 | .98613 | 54 | 2.24 | .998464 | 73 | 2.74 | .9998933 | 61 |
| 1.75 | .98667 | 52 | 2.25 | .998537 | 70 | 2.75 | .9998994 | 57 |
| 1.76 | .98719 | 50 | 2.26 | .998607 | 67 | 2.76 | .9999051 | 54 |
| 1.77 | .98769 | 48 | 2.27 | .998674 | 64 | 2.77 | .9999105 | 51 |
| 1.78 | .98817 | 47 | 2.28 | .998738 | 61 | 2.78 | .9999156 | 48 |
| 1.79 | .98864 | 45 | 2.29 | .998799 | 58 | 2.79 | .9999204 | 46 |
| 1.80 | .98909 | 43 | 2.30 | .998857 | 55 | 2.80 | .9999250 | 43 |
| 1.81 | .98952 | 42 | 2.31 | .998912 | 53 | 2.81 | .9999293 | 41 |
| 1.82 | .98994 | 41 | 2.32 | .998965 | 51 | 2.82 | .9999334 | 38 |
| 1.83 | .99035 | 39 | 2.33 | .999016 | 49 | 2.83 | .9999372 | 37 |
| 1.84 | .99074 | 37 | 2.34 | .999065 | 46 | 2.84 | .9999409 | 34 |
| 1.85 | .99111 | 36 | 2.35 | .999111 | 44 | 2.85 | .9999443 | 33 |
| 1.86 | .99147 | 35 | 2.36 | .999155 | 42 | 2.86 | .9999476 | 31 |
| 1.87 | .99182 | 34 | 2.37 | .999197 | 40 | 2.87 | .9999507 | 29 |
| 1.88 | .99216 | 32 | 2.38 | .999237 | 38 | 2.88 | .9999536 | 27 |
| 1.89 | .99248 | 31 | 2.39 | .999275 | 36 | 2.89 | .9999563 | 26 |
| 1.90 | .99279 | 30 | 2.40 | .999311 | 35 | 2.90 | .9999589 | 109 |
| 1.91 | .99309 | 29 | 2.41 | .999346 | 33 | 2.95 | .9999698 | 81 |
| 1.92 | .99338 | 28 | 2.42 | .999379 | 32 | 3.00 | .9999779 | 60 |
| 1.93 | .99366 | 26 | 2.43 | .999411 | 30 | 3.05 | .9999839 | 45 |
| 1.94 | .99392 | 26 | 2.44 | .999441 | 28 | 3.10 | .9999884 | 32 |
| 1.95 | .99418 | 25 | 2.45 | .999469 | 28 | 3.15 | .9999916 | 24 |
| 1.96 | .99443 | 23 | 2.46 | .999497 | 26 | 3.20 | .9999940 | 29 |
| 1.97 | .99466 | 23 | 2.47 | .999523 | 24 | 3.30 | .9999969 | 16 |
| 1.98 | .99489 | 22 | 2.48 | .999547 | 24 | 3.40 | .9999985 | 8 |
| 1.99 | .99511 | 21 | 2.49 | .999571 | 22 | 3.50 | .9999993 | 3 |
| 2.00 | .99532 | 20 | 2.50 | .999593 | 21 | 3.60 | .9999996 | |
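The tabulated function is what is now commonly called the error function, so a few entries can be compared with a library routine. This is a sketch for checking purposes only: the choice of sample entries is arbitrary, and `math.erf` from the Python standard library is assumed to be an acceptable reference.

```python
import math

# A handful of (z, y) pairs copied from the tables above.
tabulated = {
    0.01: 0.01128,
    0.50: 0.52050,
    1.00: 0.84270,
    1.50: 0.96611,
    2.00: 0.99532,
    2.50: 0.999593,
    3.00: 0.9999779,
    3.60: 0.9999996,
}

# Largest absolute disagreement between the table and math.erf.
max_error = max(abs(math.erf(z) - y) for z, y in tabulated.items())
print(max_error)
```

Intermediate values of z can be obtained from the table by simple proportion on the Δ column, or directly from the library function.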
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Table  of 

[The  constants  are  restricted  to  positive  quantities  of  significant  value 


Type 

I 

II 

Character  of  Curve 

Equation  y  = 

Limits  of  x 

Mean 

M2 

Shape 

Range 

Limited 
both 
ways 

Lower 

Upper 
+  a 

Ms 

Symmetrical 

"-1 

0 

a2 

0 

lc{a^-^' 

»  +  l 

Symmetrical 

Un- 
limited 

Jce  ~  ^'''  (Normal  Curve) 

—  00 

+  <X)    1 

0 

w 

0 

III 

Symmetrical 

Un- 

j'„2 ,  „2^  ""I'l  "^^1 

1 

0 

a" 

0 

limited 

Limited 
both 
ways 

k{a~  +  X-)     \-       1           —00+00 
(m>3)                  I 

u-1 

IV 
V 

Skew 

(p  +  2  =  l) 

+  a 

{q-p)a 

M+1 

ma^ 

16{p-q)pq 
(M  +  i)(w  +  2) 

Skew 

Limited 
one  way 

m-1     -  - 
kx          e 

0 

-t-oo 

ma 

2ma^ 

VI 

VII 

VIII 

Skew 
Skew 

Limited 
oue  way 

(«>3) 

+  a 

+  00 

(p+q)a 

w  — 1 

«2 

I6(p  +  q)pq 
(«-l)(«-2) 

Limited 
one  way 

a 
(M>3) 

0 

+  05 

a 

n 

4a-' 

n'{n-l.) 

n\n  —  l){n- 

Skew 

Un- 
limited 

•                  (»  >  3) 

—  00 

+  00 

pa 
n 

n"(n-l) 

i 

Notes  :— /3i  =  /a/-^M/.     ^82=^4-^ Ma"' 
Skewness  =  (Mean — mode)  -h  <r 
/3i 


Criterion  =  K  = 


4(2 -37)  (4 -37) 
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Frequency  Curves. 

(i.e.,  all   >0),  and   in   Types   III,  VI,  VII   and   VIII,   n   must   be    >3]. 


M4 


3o^ 

(»  +  l)(?j  +  3) 

|o^  =  3mo- 

3a^ 

(»-lX»-3) 

^pq{2  +  n-Qpq)    ^ 

■-    •    ..  3 


^1=M3'-^M: 


(«  +  1)(»  +  'l){n  +  3) 


3»»(m  +  2)a-* 


4(»  +  l)(;>-g)^ 


48^5'(2  +  w  +  6pj)    ^4 
(»-l)(7i-2)(«-3) 


Not  required 


j1 


{n-2Ypq 


16(»-1) 


)8a  +  3 


2  w  +  3 

3  »  +  2 

2 

>  _ 

3 


V»-3J 


2 
3 


2  »-3 

3  n  —  '2i 

2 
^3 


2   »  +  3 


3[2  +  (?t-6)^g](»  +  l) 
(»  +  2)(m  +  3)^2 


'  3   n  +  2, 


\^pq        ) 


3 

(negative) 

2 
3 

CO 

2    ?j-3 


4(m-1)(p  +  ?)'-  ^        3[2  +  (»  +  6)/?7](n-l)        |  3  m-2 
(»  — :i)('*~'^)i'2 


I     <- 
3 


2 

w  — 

3 

3 

»  — 

2 

"3 

2 

2 

M  — 

3 

„2^^?)[(„  +  6)(»2  +  ;/=)-8«'^]     ^  16(»-1)       V-       3(w-l)[(»  +  6)(>r  +  y-)-8»'^]     3  »-2  -pj— 7 

n^(»-l)(»-2X»-3)  "     (»-2)-  'm-^^-  {n-'l){n-Z){n-  +  v-)         \      ^   2      ,       > q    I  1 


Standard  Deviation  =  '^ jj.>  =  o- . 

^gi(/3,  +  3) 
2(5/3.: -«)3i- 9)' 

/3,()3..  +  3)2 


4(2/32-3/3i-6j(4)8,-3/3i) 
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